{"id":298,"date":"2017-08-24T07:08:29","date_gmt":"2017-08-24T07:08:29","guid":{"rendered":"http:\/\/americanboard.org\/Subjects\/mathematics\/?page_id=298"},"modified":"2017-09-13T08:03:47","modified_gmt":"2017-09-13T08:03:47","slug":"beginning-statistics","status":"publish","type":"page","link":"https:\/\/americanboard.org\/Subjects\/mathematics\/beginning-statistics\/","title":{"rendered":"Beginning Statistics"},"content":{"rendered":"<div class=\"twelve columns\" style=\"margin-top: 10%;\">\n<div class=\"advance\"><a class=\"button button-primary\" href=\"http:\/\/americanboard.org\/Subjects\/mathematics\/independent-and-dependent-events\">\u2b05 Previous Lesson<\/a>\u00a0<a class=\"button\" href=\"http:\/\/americanboard.org\/Subjects\/mathematics\/probability-statistics-data-analysis\">Workshop Index<\/a>\u00a0<a class=\"button button-primary\" href=\"http:\/\/americanboard.org\/Subjects\/mathematics\/data-displays-normal-distributions-and-lines-of-best-fit\">Next\u00a0Lesson \u27a1<\/a><\/div>\n<p><!-- CONTENT BEGINS HERE --><\/p>\n<h1 id=\"title\">Beginning Statistics<\/h1>\n<h4>Objective<\/h4>\n<p>In this lesson, you will study how to compute the mean, mode, median, variance, and standard deviation of a\u00a0distribution of data. We will also see how to determine the minimum, maximum, upper quartile, and lower quartile of\u00a0the data set and use this information to display the data set in a box-and-whisker plot.<\/p>\n<section>\n<h4>What is statistics?<\/h4>\n<p>Statistics is a mathematical discipline concerned with\u00a0collecting, organizing, and interpreting data. It is closely\u00a0related to probability. When evaluating probability, we take\u00a0information about an event&#8217;s possible outcome and calculate the\u00a0likelihood of that outcome occurring. In statistics, we take a\u00a0<abbr title=\"a collection of values representing a population. 
It is usually represented as a range of figures or terms enclosed in braces.\">data set<\/abbr> from a\u00a0population and extrapolate information about that population.\u00a0Using this information, we can make educated guesses about what\u00a0will happen in the future or learn more about a certain aspect of\u00a0the population that makes up our data set. Statistics plays an\u00a0important role in many fields, including the social and biological\u00a0sciences.<\/p>\n<p>A <strong><em>data set<\/em>, <\/strong>also called a data\u00a0distribution, is a collection of numbers representing one property\u00a0of a population. A given data set can be quite large, presenting\u00a0an overwhelming collection of numbers or properties. When\u00a0analyzing a data set, we can focus on a few important features of the\u00a0set to help simplify and summarize the information it contains.\u00a0The most important of these features are listed below.<\/p>\n<ul>\n<li>The <abbr title=\"the average value of a data set: the sum of the elements divided by the number of elements\">mean<\/abbr> of a data set is the average value.<\/li>\n<li>The <abbr title=\"the value that occurs most often in a data set\">mode<\/abbr> of a data set is the value that occurs most often.<\/li>\n<li>The <abbr title=\"the difference between the extreme minimum and maximum of a data set\">range<\/abbr> is the difference between the largest value in the data set and\u00a0the smallest value in the data set.<\/li>\n<li>The <abbr title=\"the middle number of a data set\">median<\/abbr> of a data set is the number that falls in the middle of the data\u00a0set when all values in the data set are listed in increasing order.<\/li>\n<\/ul>\n<p>Data sets are usually enclosed in braces, or curly brackets {\u00a0}. A data set can be made up of numbers or words, but the elements\u00a0of a data set are usually numbers. 
The tools and terminology\u00a0in this lesson apply to numerical data sets, and <em>every<\/em> data set we will\u00a0consider in this lesson will be a numerical one.<\/p>\n<p><strong>Examples of data sets<\/strong><\/p>\n<ul>\n<li>the final exam grades for a particular class<br \/>\n<blockquote><p>{98, 88, 75, 93, 92, 88, 68, 95, 100, 87, 90, 73}<\/p><\/blockquote>\n<\/li>\n<li>the annual profit for a group of companies<br \/>\n<blockquote><p>{$1,250,000, $750,000, $900,000, $1,100,000,\u00a0$1,500,000, $1,100,000, $2,300,000, $5,500,000}<\/p><\/blockquote>\n<\/li>\n<li>the ages of people in a room<br \/>\n<blockquote><p>{5, 34, 26, 26, 19, 21, 35, 57, 23, 34, 28, 29}<\/p><\/blockquote>\n<\/li>\n<li>the amount of money spent each month by each member of a\u00a0group<br \/>\n<blockquote><p>{$55, $75, $115, $35, $61, $80, $101, $54, $120}<\/p><\/blockquote>\n<\/li>\n<li>the amount of time (in minutes) of a person\u2019s\u00a0telephone calls<br \/>\n<blockquote><p>{2.5, 0.9, 1.9, 12, 4.5, 25, 14, 45, 90, 5.25, 8.0,\u00a0102}<\/p><\/blockquote>\n<\/li>\n<\/ul>\n<h3>Beginning Statistics<\/h3>\n<p>Let&#8217;s try an example. The data set {84, 92, 88, 91, 95, 92,\u00a0100, 96, 99} contains a student&#8217;s test grades in her history\u00a0course. For this data set, find the:<\/p>\n<ul>\n<li>mean,<\/li>\n<li>mode,<\/li>\n<li>median, and<\/li>\n<li>range.<\/li>\n<\/ul>\n<p>The mean is the average of all the elements in the data set. Here the nine grades sum to 837, so the mean is 837 \u00f7 9 = 93.<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p3_clip_image003.gif\" width=\"575\" height=\"34\" name=\"graphics3\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p>The mode is the value that appears most frequently. 
In this\u00a0data set, the mode is 92.<\/p>\n<div class=\"callout\">\n<h4>Be Aware!<\/h4>\n<p>The mode of a data set can be more than one number. The mode is defined\u00a0as the number in a data set that occurs most often, a definition that seems like it can only apply to\u00a0one number. However, consider the set {3, 15, 6, 23, 17, 15, 9, 11, 23, 2}. In this set, both 15 and 23\u00a0occur twice, while all of the other numbers occur only once. Because 15 and 23 recur an equal number of\u00a0times, the mode of the set is 15 <em>and<\/em> 23.<\/p>\n<p class=\"notebox_text\" align=\"left\"><span style=\"text-decoration: none;\">If a data set has two, or three, or four numbers that occur most often, the mode is all of those numbers. If <\/span><em><span style=\"text-decoration: none;\">none<\/span><\/em><span style=\"text-decoration: none;\"> of the numbers in the data set occur more than once, the set has no mode. <\/span><\/p>\n<\/div>\n<p>The median is the number that falls in the middle of the data\u00a0set. More specifically, the median has an equal number of data set\u00a0elements above it and below it. When determining the median, it\u00a0helps to rewrite the data set with its elements in increasing\u00a0order. In this case, the rewritten data set is {84, 88, 91, 92,\u00a092, 95, 96, 99, 100}. Then determine which number lies in the\u00a0middle of the data set. The median of this data set is 92.<\/p>\n<div class=\"callout\">\n<h4>Be Aware!<\/h4>\n<p>When determining the median of a data set, always rearrange the set so\u00a0that its elements are in increasing order. This is the best way to avoid careless error.<\/p>\n<p class=\"notebox_text\" align=\"LEFT\"><span style=\"text-decoration: none;\">For a data set with an <\/span><em><span style=\"text-decoration: none;\">odd <\/span><\/em><span style=\"text-decoration: none;\">number of elements, the median has the same number of elements above and below it. 
<\/span><\/p>\n<p class=\"notebox_text\" align=\"LEFT\">For a data set with an <em>even<\/em> number of elements, the median is the <em>average<\/em> of the two central elements.<\/p>\n<p class=\"notebox_text\" style=\"text-decoration: none;\" align=\"LEFT\">For example, the median of the set {1, 4,\u00a07, 19, 24, 28, 30, 42} is <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p3_clip_image006.gif\" width=\"69\" height=\"20\" name=\"graphics4\" align=\"ABSMIDDLE\" border=\"0\" \/>.<\/p>\n<\/div>\n<p>The range is the difference between a set&#8217;s absolute minimum\u00a0(the smallest value in the set) and its absolute maximum (the\u00a0largest value in the set). The range for the set {84, 92, 88, 91,\u00a095, 92, 100, 96, 99} is 100 \u2013 84 = 16.<\/p>\n<h3><strong>How do we interpret the properties of a data set?<\/strong><\/h3>\n<p>The data set in our example is not very large; a quick glance\u00a0over the set gives us a good idea of the student&#8217;s performance in\u00a0her history class. When a data set is very large\u00a0(containing hundreds, maybe thousands, of numbers), however, we cannot draw\u00a0good conclusions by glancing over the data. In that case, the four\u00a0features of a set mentioned above, <em>mean, mode, range, and\u00a0median<\/em>, can help us form a general idea, or summary, of the\u00a0information in a data set. These values will tell us the average\u00a0value, the number that occurs most often, the number that the\u00a0values of the data set are centered around, and the spread between\u00a0the greatest value in the data set and the least value in the data\u00a0set. 
This information can be a great help in analyzing or\u00a0simplifying complicated data sets.<\/p>\n<h3>What are variance and standard deviation?<\/h3>\n<p>The mean, mode, range, and median are important values for a\u00a0data set. However, a lot of information is lost when we reduce a\u00a0large data set to these four numbers. Furthermore, these values\u00a0are clearly not unique to a particular data set. Two very\u00a0different data sets could have the same mean, mode, range, or\u00a0median. To understand the elements of a data set better, we will\u00a0introduce two more values associated with a data set.<\/p>\n<p>Let our data set be represented by\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p5_clip_image003.gif\" width=\"109\" height=\"16\" name=\"graphics3\" align=\"ABSMIDDLE\" border=\"0\" \/>.\u00a0The mean is symbolized by\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p5_clip_image006.gif\" width=\"7\" height=\"17\" name=\"graphics4\" align=\"BOTTOM\" border=\"0\" \/>\u00a0and defined by the formula\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p5_clip_image009.gif\" width=\"152\" height=\"38\" name=\"graphics5\" align=\"ABSMIDDLE\" border=\"0\" \/>.<\/p>\n<p>The <abbr title=\" a measure of the dispersion of elements of a data set. It is defined as the average of the squared differences between each element and the mean. 
\"><span style=\"text-decoration: none;\">variance<\/span><\/abbr><span style=\"text-decoration: none;\"> <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p5_clip_image012.gif\" width=\"14\" height=\"14\" name=\"graphics6\" align=\"BOTTOM\" border=\"0\" \/>\u00a0is a positive number <\/span>defin<span style=\"text-decoration: none;\">ed\u00a0by\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p5_clip_image015.gif\" width=\"112\" height=\"38\" name=\"graphics7\" align=\"ABSMIDDLE\" border=\"0\" \/>.<\/span><\/p>\n<div class=\"callout\">\n<h4>Be Aware!<\/h4>\n<p><span style=\"text-decoration: none;\">There is another, slightly different way to define variance. This other way is sometimes called the <\/span><em><span style=\"text-decoration: none;\">unbiased variance<\/span><\/em><span style=\"text-decoration: none;\">. The formula for unbiased variance is <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p5_clip_image018.gif\" width=\"123\" height=\"38\" name=\"graphics8\" align=\"ABSMIDDLE\" border=\"0\" \/>. Notice that, in this formula, we divide the summation by <\/span><em><span style=\"text-decoration: none;\">n<\/span><\/em><span style=\"text-decoration: none;\">-1 instead of <\/span><em><span style=\"text-decoration: none;\">n<\/span><\/em><span style=\"text-decoration: none;\">. Many textbooks use this form of variance and simply call it \u201cvariance.\u201d Be aware of this, and pay close attention to which version you are dealing with any given case. <\/span><\/p>\n<\/div>\n<h3>Beginning Statistics<\/h3>\n<p>The variance formula tells us to average the square of the\u00a0differences between all the values of the data set and the mean\u00a0value. The resulting value is the variance and it measures how\u00a0spread out or scattered the elements of the data set are. 
The\u00a0larger the variance (<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image003.gif\" width=\"14\" height=\"14\" name=\"graphics3\" align=\"BOTTOM\" border=\"0\" \/>),\u00a0the more scattered the elements of the data set are. The variance\u00a0is zero only if every value equals the mean.<\/p>\n<p>The <abbr title=\"the square root of the variance\"><span style=\"text-decoration: none;\">standard\u00a0deviation<\/span><\/abbr> <span style=\"text-decoration: none;\"> <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image006.gif\" width=\"11\" height=\"8\" name=\"graphics4\" align=\"BOTTOM\" border=\"0\" \/>\u00a0is the square root of the variance. Because the variance is never\u00a0negative (it is an average of squares), we do not have to worry about taking the square\u00a0root of a negative number. Much like the variance, the standard\u00a0deviation measures the dispersion of the elements of a data set.\u00a0The larger the standard deviation is, the greater the spread among\u00a0the elements of the data set. 
Standard deviation is defined by\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image009.gif\" width=\"119\" height=\"46\" name=\"graphics5\" align=\"ABSMIDDLE\" border=\"0\" \/>.<br \/>\n<\/span><\/p>\n<section class=\"question\">\n<h4>Question<\/h4>\n<div>\n<p>Consider the four data sets below.<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image012.gif\" width=\"133\" height=\"109\" name=\"graphics6\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p><a name=\"one\"><\/a>Which data set has the largest standard\u00a0deviation?<\/p>\n<ol>\n<li><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image015.gif\" width=\"15\" height=\"16\" name=\"graphics7\" align=\"BOTTOM\" border=\"0\" \/><\/li>\n<li><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image018.gif\" width=\"16\" height=\"16\" name=\"graphics8\" align=\"BOTTOM\" border=\"0\" \/><\/li>\n<li><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image021.gif\" width=\"16\" height=\"16\" name=\"graphics9\" align=\"BOTTOM\" border=\"0\" \/><\/li>\n<li><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image024.gif\" width=\"16\" height=\"16\" name=\"graphics10\" align=\"BOTTOM\" border=\"0\" \/><\/li>\n<\/ol>\n<\/div>\n<p><a class=\"button button-primary q-answer\"> Reveal Answer <\/a><\/p>\n<div class=\"q-reveal\">\n<p>The correct choice is C. 
First, notice that each data set has\u00a0the same mean value:\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image027.gif\" width=\"31\" height=\"17\" name=\"graphics11\" align=\"ABSMIDDLE\" border=\"0\" \/>.\u00a0In this situation, the mean value does not give us very much\u00a0information\u2014it certainly does not distinguish the data sets\u00a0from one another. However, their standard deviations are all\u00a0different, as we will see below.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/s5_p5_html_37daa5b0.gif\" width=\"567\" height=\"99\" name=\"graphics12\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/s5_p5_html_m60a761d2.gif\" width=\"505\" height=\"99\" name=\"graphics13\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/s5_p5_html_5865f8e3.gif\" width=\"559\" height=\"99\" name=\"graphics14\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/s5_p5_html_35fcdeb8.gif\" width=\"497\" height=\"99\" name=\"graphics15\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p>Choice C makes sense because we can easily see that the\u00a0elements of\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image042.gif\" width=\"16\" height=\"16\" name=\"graphics16\" align=\"ABSMIDDLE\" border=\"0\" \/>\u00a0are more widely dispersed than those of any other data set.<\/p>\n<\/div>\n<\/section>\n<h4>What\u2019s the difference between variance and standard\u00a0deviation?<\/h4>\n<p>The important difference between variance and standard\u00a0deviation arises with respect to units. 
Suppose we compiled a data\u00a0set about the students in a class, consisting of each student&#8217;s\u00a0height and using inches as our unit. To calculate the mean, we sum\u00a0the heights and divide by the number of students, so the mean has the\u00a0same unit as the elements of the data set: inches.<\/p>\n<p>To compute variance, we introduce a square. This means the unit\u00a0for the variance is\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/s5_p5_html_m56df6197.gif\" width=\"51\" height=\"21\" name=\"graphics23\" align=\"BOTTOM\" border=\"0\" \/>.\u00a0By taking the square root to get the standard deviation, we\u00a0return the units to inches. The standard deviation thus gives us a\u00a0measure of a data set&#8217;s dispersion, one which has the same units\u00a0as the elements of the data set. Variance is the only value\u00a0discussed here whose unit is squared; the mean, mode, range,\u00a0median, and standard deviation are all measured in the original unit.<\/p>\n<p class=\"notebox_text\" align=\"left\">Let <img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image045.gif\" width=\"109\" height=\"16\" name=\"graphics17\" align=\"ABSMIDDLE\" border=\"0\" \/> be our data set.<\/p>\n<div align=\"left\">\n<ul>\n<li class=\"notebox_text\">mean<\/li>\n<\/ul>\n<\/div>\n<p align=\"center\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image048.gif\" width=\"204\" height=\"38\" name=\"graphics18\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<div align=\"left\">\n<ul>\n<li class=\"notebox_text\">variance<\/li>\n<\/ul>\n<\/div>\n<p align=\"center\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image051.gif\" width=\"256\" height=\"41\" name=\"graphics19\" 
align=\"BOTTOM\" border=\"0\" \/><\/p>\n<div align=\"left\">\n<ul>\n<li class=\"notebox_text\">standard deviation<\/li>\n<\/ul>\n<\/div>\n<p align=\"center\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_clip_image054.gif\" width=\"273\" height=\"46\" name=\"graphics20\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<h3>How do we simplify data sets using graphs?<\/h3>\n<p>Another way to simplify the\u00a0information in a data set is to display the information in a\u00a0graph. The <abbr title=\"data display that allows you to quickly extract pertinent information, such as the median, the lower and upper quartiles, and the minimum and maximum\">box-and-whisker\u00a0plot<\/abbr> is\u00a0one commonly used graph. Before we can draw a box-and-whisker\u00a0plot, however, we must introduce some new terms.<\/p>\n<ul>\n<li>The <abbr title=\"the smallest element in a data set \">minimum<\/abbr> of a data set is the smallest element of that data set.<\/li>\n<li>The <abbr title=\" the largest element of that data set \">maximum<\/abbr> of a data set is the largest element of that data set.<\/li>\n<li>The <abbr title=\"the median of the elements in the lower half of a data set, between the minimum and the median \">lower\u00a0quartile <\/abbr> is the median of the elements that lie\u00a0between the minimum and the median of the entire set.<\/li>\n<li>The <abbr title=\" the median of the set of elements greater than the median in a data set. 
\">upper\u00a0quartile<\/abbr> is the median of the set of the elements that lies\u00a0between the maximum and the median of the entire set.<\/li>\n<\/ul>\n<p>These values provide a useful way to segregate the elements of\u00a0a data set.<\/p>\n<p>For example, given the data set below:<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p6_html_m4d2c2cf7.gif\" width=\"355\" height=\"27\" name=\"graphics3\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p>find the:<\/p>\n<p>(a) median,<\/p>\n<p>(b) minimum and maximum,<\/p>\n<p>(c) lower quartile, and<\/p>\n<p>(d) upper quartile.<\/p>\n<p>The upper and lower quartiles are just the medians of the upper\u00a0and lower halves of the data set. Just as we did when finding the\u00a0median, we can simplify things by arranging the elements of the\u00a0data set in increasing order.<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p7_clip_image006.gif\" width=\"398\" height=\"23\" name=\"graphics4\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p>For part (a), the median of this data set is\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p7_clip_image009.gif\" width=\"89\" height=\"34\" name=\"graphics5\" align=\"ABSMIDDLE\" border=\"0\" \/>.<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20001.JPG\" alt=\" Data set showing median\" width=\"400\" height=\"67\" name=\"graphics6\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p>For part (b), the minimum is 5 and the maximum is 98.<\/p>\n<p>To determine the lower quartile, we must find the median of the\u00a0lower half of the data set, between the median and the minimum.<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" 
src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20002.JPG\" alt=\"Data set showing lower quartile\" width=\"400\" height=\"76\" name=\"graphics7\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p>For part (c), the lower quartile is\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p7_clip_image012.gif\" width=\"87\" height=\"34\" name=\"graphics8\" align=\"ABSMIDDLE\" border=\"0\" \/>.<\/p>\n<p>To determine the upper quartile, we must find the median of the\u00a0upper half of the data set, between the median and the maximum.<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20003.JPG\" width=\"400\" height=\"76\" name=\"graphics9\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p>For part (d), the upper quartile is\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p7_clip_image015.gif\" width=\"102\" height=\"34\" name=\"graphics10\" align=\"ABSMIDDLE\" border=\"0\" \/>.<\/p>\n<h3>Why do we have quartiles?<\/h3>\n<p>Quartiles partition the data set into four equal parts, each of\u00a0which contains 25% of the data in the set. 
In the example above,\u00a0the data set consists of 16 elements, so each of the four\u00a0parts contains 4 of the numbers from the original data set.\u00a0The location of the quartiles quickly gives us an idea of the data\u00a0set&#8217;s distribution.<\/p>\n<p>A box-and-whisker plot graphically represents the median, upper\u00a0quartile, lower quartile, and maximum and minimum along a number line.<\/p>\n<p>We will now draw the box-and-whisker plot associated with the\u00a0data set from the example above,<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/s5_p7_html_m4d2c2cf7.gif\" width=\"355\" height=\"27\" name=\"graphics3\" align=\"BOTTOM\" border=\"0\" \/>.<\/p>\n<p>To make a box-and-whisker plot, begin by drawing a number line\u00a0large enough to hold the entire data set. Mark the median value of\u00a0the data set. For this data set, the median is 24.<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20004.JPG\" alt=\" Box and whisker plot step 1\" width=\"400\" height=\"79\" name=\"graphics4\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p>Next, make a mark above the lower and upper quartiles. 
For this\u00a0set, the lower quartile is 15 and the upper quartile is 37.5.<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20005.JPG\" alt=\" Box and whisker plot step 2\" width=\"400\" height=\"79\" name=\"graphics5\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p>Next, make a box around the upper and lower quartiles and draw\u00a0dots above the minimum and the maximum.<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20006.JPG\" alt=\"Box and whisker plot step 3\" width=\"400\" height=\"79\" name=\"graphics6\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<p>Finish by drawing a line from each side of the box out to the extreme values. These\u00a0&#8220;whiskers&#8221; added to the box around the quartiles give\u00a0this data display its name.<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20007.JPG\" alt=\" Box and whisker plot step 4\" width=\"400\" height=\"79\" name=\"graphics7\" align=\"BOTTOM\" border=\"0\" \/><\/p>\n<h3>How do we interpret a box-and-whisker plot?<\/h3>\n<p>The box-and-whisker plot gives us an idea of the way that a\u00a0data set&#8217;s elements are distributed. With just a quick glance at\u00a0the plot, we can gather a lot of useful information.<\/p>\n<p>The location of the box on the whiskers tells us where the\u00a0middle half of the data set&#8217;s elements lie. In our example, the box\u00a0sits far on the left half of the whiskers. We should therefore\u00a0expect most of the elements to be less than the upper quartile\u00a0value.<\/p>\n<p>Within the box, the location of the median value tells us how\u00a0evenly distributed the elements are around the median. 
In this\u00a0example, the median value is almost exactly in the middle of the\u00a0box, so we would expect that the elements are equally spread out\u00a0around the median. In this case, there should be about as many\u00a0elements between the lower quartile and the median as there are\u00a0between the median and the upper quartile.<\/p>\n<p>The real value of the box-and-whisker plot is that information\u00a0such as that we just deduced about the data distribution holds no\u00a0matter what the size of the data set is. Suppose we had a data set\u00a0that consisted of 10,000 numbers, a set which had a similar\u00a0box-and-whisker plot to the one described above. These two\u00a0statements about the data distribution would still be true for\u00a0this much larger data set, and we can tell that this is so because\u00a0they have similar box-and-whisker plots.<\/p>\n<section class=\"question\">\n<h4>Question<\/h4>\n<div>\n<p>Which box-and-whisker plot below correctly displays the\u00a0following data set?<\/p>\n<p align=\"CENTER\">{2, 5, 16, 12, 10, 18, 14, 12, 15, 3, 4, 7, 6, 11}<\/p>\n<ol>\n<li><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20008.JPG\" alt=\" Box-and-whisker plot 1\" width=\"400\" height=\"72\" name=\"graphics3\" align=\"BOTTOM\" border=\"0\" \/><\/li>\n<li><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20009.JPG\" alt=\"Box-and-whisker plot 2\" width=\"400\" height=\"72\" name=\"graphics4\" align=\"BOTTOM\" border=\"0\" \/><\/li>\n<li><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20010.JPG\" alt=\" Box-and-whisker plot 3\" width=\"400\" height=\"72\" name=\"graphics5\" align=\"BOTTOM\" border=\"0\" \/><\/li>\n<li><img loading=\"lazy\" decoding=\"async\" 
src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/Math%20Mod%207.3%20Art%20011.JPG\" alt=\"Box-and-whisker plot 4\" width=\"400\" height=\"72\" name=\"graphics6\" align=\"BOTTOM\" border=\"0\" \/><\/li>\n<\/ol>\n<\/div>\n<p><a class=\"button button-primary q-answer\"> Reveal Answer <\/a><\/p>\n<div class=\"q-reveal\">\n<p>The correct choice is D. To create a box-and-whisker plot, we\u00a0must first calculate the median. To do so, rearrange the data set so\u00a0that the elements are in increasing order.<\/p>\n<p align=\"CENTER\">{2, 3, 4, 5, 6, 7, 10, 11, 12, 12, 14, 15, 16, 18}<\/p>\n<p>This data set has an even number of elements, so the median is\u00a0the average of the two central elements, 10 and 11:\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p9_clip_image003.gif\" width=\"98\" height=\"34\" name=\"graphics7\" align=\"ABSMIDDLE\" border=\"0\" \/>.\u00a0To calculate the lower quartile, compute the median of the set\u00a0{2, 3, 4, 5, 6, 7, 10}. The lower quartile is 5. To determine the\u00a0upper quartile, find the median of the set {11, 12, 12, 14, 15,\u00a016, 18}. The upper quartile is 14. Finally, the minimum of the\u00a0data set is 2 and the maximum of the data set is 18. The only\u00a0choice that properly reflects this information is D.<\/p>\n<\/div>\n<\/section>\n<h3>Review of New Vocabulary and Concepts<\/h3>\n<ul>\n<li>A <em><strong>data set<\/strong><\/em>,\u00a0also called a <strong><em>data distribution<\/em><\/strong>, is a\u00a0collection of values representing a population. It is usually\u00a0represented as a range of figures or terms enclosed in braces,\u00a0e.g.,\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p10_clip_image003.gif\" width=\"109\" height=\"16\" name=\"graphics3\" align=\"ABSMIDDLE\" border=\"0\" \/>.<\/li>\n<li>The <em><strong>mean <\/strong><\/em>of\u00a0a data set is the average value. 
The mean is defined by the\u00a0formula\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p10_clip_image006.gif\" width=\"65\" height=\"38\" name=\"graphics4\" align=\"ABSMIDDLE\" border=\"0\" \/>.<\/li>\n<li>The <em><strong>mode\u00a0<\/strong><\/em>of a data set is the value that occurs most often. Remember that\u00a0a data set can have more than one mode, because several values might tie for the most occurrences.<\/li>\n<li>The <em><strong>range <\/strong><\/em>is\u00a0the difference between the largest number in a data set (the\u00a0maximum) and the smallest number in that data set (the minimum).<\/li>\n<li>The <em><strong>median\u00a0<\/strong><\/em>of a data set is the number that falls in the middle of the set when its values are listed in increasing order. If the set has an even number of elements, the median is the average of the two central elements.<\/li>\n<li>The <em><strong>variance\u00a0<\/strong><\/em>measures how widely a data set&#8217;s elements are spread out around the mean. It is\u00a0defined by the formula\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p10_clip_image009.gif\" width=\"112\" height=\"38\" name=\"graphics5\" align=\"ABSMIDDLE\" border=\"0\" \/>.<\/li>\n<li>The <strong><em>standard\u00a0deviation <\/em><\/strong>is the square root of the variance. It is\u00a0defined by the formula\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/americanboard.org\/Subjects\/Images\/math\/7\/images\/s5_p10_clip_image012.gif\" width=\"119\" height=\"46\" name=\"graphics6\" align=\"ABSMIDDLE\" border=\"0\" \/>.<\/li>\n<li>The <strong><em>maximum\u00a0<\/em><\/strong>of a data set is the largest element in that data set.<\/li>\n<li>The <strong><em>minimum\u00a0<\/em><\/strong>of a data set is the smallest element in that data set.<\/li>\n<li>The <strong><em>lower quartile\u00a0<\/em><\/strong>of a data set is the median of the set of elements between the\u00a0set&#8217;s minimum and its median. 
These elements come before the median in the ordered data set and include the\u00a0minimum.<\/li>\n<li>The <strong><em>upper quartile\u00a0<\/em><\/strong>of a data set is the median of the set of elements between the\u00a0set&#8217;s maximum and its median. These elements come after the\u00a0median in the ordered data set and include the maximum.<\/li>\n<li>A <strong><em>box-and-whisker plot<\/em><\/strong> displays\u00a0a data set&#8217;s minimum, lower quartile, median, upper quartile, and maximum graphically.<\/li>\n<\/ul>\n<\/section>\n<p><!-- CONTENT ENDS HERE --><\/p>\n<div class=\"advance\"><a class=\"button button-primary\" href=\"http:\/\/americanboard.org\/Subjects\/mathematics\/independent-and-dependent-events\">\u2b05 Previous Lesson<\/a>\u00a0<a class=\"button\" href=\"http:\/\/americanboard.org\/Subjects\/mathematics\/probability-statistics-data-analysis\">Workshop Index<\/a>\u00a0<a class=\"button button-primary\" href=\"http:\/\/americanboard.org\/Subjects\/mathematics\/data-displays-normal-distributions-and-lines-of-best-fit\">Next\u00a0Lesson \u27a1<\/a><\/div>\n<p><a class=\"backtotop\" href=\"#title\">Back to Top<\/a><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>\u2b05 Previous Lesson\u00a0Workshop Index\u00a0Next\u00a0Lesson \u27a1 Beginning Statistics Objective In this lesson, you will study how to compute the mean, mode, median, variance, and standard deviation of a\u00a0distribution of data. 
We will also see how to determine the minimum, maximum, upper quartile, and lower quartile of\u00a0the data set and use this information to display the data [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-298","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/americanboard.org\/Subjects\/mathematics\/wp-json\/wp\/v2\/pages\/298","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/americanboard.org\/Subjects\/mathematics\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/americanboard.org\/Subjects\/mathematics\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/americanboard.org\/Subjects\/mathematics\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/americanboard.org\/Subjects\/mathematics\/wp-json\/wp\/v2\/comments?post=298"}],"version-history":[{"count":10,"href":"https:\/\/americanboard.org\/Subjects\/mathematics\/wp-json\/wp\/v2\/pages\/298\/revisions"}],"predecessor-version":[{"id":703,"href":"https:\/\/americanboard.org\/Subjects\/mathematics\/wp-json\/wp\/v2\/pages\/298\/revisions\/703"}],"wp:attachment":[{"href":"https:\/\/americanboard.org\/Subjects\/mathematics\/wp-json\/wp\/v2\/media?parent=298"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}