
More Statistics



If you have any test reviews, homeworks, guides, anything school related that you think can be posted on this website, reach out to me at makingschooleasier@gmail.com  


Symbols:
 
∑ = sum
 
Population - All of the items we are interested in. It can be finite (such as the passengers on a plane) or infinite (such as all Cokes bottled in an ongoing process). Population symbols are commonly Greek letters. (But not always)
 

N = size
µ = mean
σ² = variance
σ = standard deviation

Sample - A subset of the population that we will actually be analyzing. Sample symbols are commonly Roman letters. (But not always)


n = size
x̄  = mean
s² = variance
s = standard deviation

The Three Main Measurements
 
Central Tendency:
 
Knowing how and when to use a measure of central tendency comes down to knowing what you're trying to learn about the data. Does the data include unusual circumstances (extreme values)? Is it about compounded growth? The answers to these questions determine which measure of central tendency you'll use.
 
The Six Measures of Central Tendency
 
Statistic | Formula | Excel Command (just in case)
Mean | $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (sample mean = the sum of the values divided by the number of values) | =AVERAGE(Data)
Median | The middle value in a sorted array of values | =MEDIAN(Data)
Mode | The most frequently occurring value | =MODE(Data)
Midrange | $\frac{x_{min} + x_{max}}{2}$ (the sum of the largest and smallest values divided by 2) | =0.5*(MIN(Data)+MAX(Data))
Geometric Mean | $G = \sqrt[n]{x_1 x_2 \cdots x_n}$ (the nth root of the product of all of the values, where n is the number of values in the set; a "square" root is the 2nd root and a "cube" root is the 3rd) | =GEOMEAN(Data)
Trimmed Mean | Same as the mean, except omitting the highest and lowest k% of the data (e.g. 5%) | =TRIMMEAN(Data, Percent), where Percent is the total fraction of data points excluded, split evenly between the two ends
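To see the six measures side by side, here is a minimal Python sketch using the standard `statistics` module. The sample values are made up for illustration, and `midrange`, `trimmed_mean`, and `central_tendency` are hypothetical helper names, not anything from the course or from Excel. Here `k` is the fraction dropped from each end, which matches the "highest and lowest k%" description in the table.

```python
import statistics


def midrange(data):
    """Average of the smallest and largest values."""
    return (min(data) + max(data)) / 2


def trimmed_mean(data, k):
    """Mean after dropping the lowest and highest fraction k of the sorted values."""
    ordered = sorted(data)
    drop = int(len(ordered) * k)  # number of values removed from each end
    kept = ordered[drop:len(ordered) - drop]
    return statistics.mean(kept)


def central_tendency(data):
    """Return the six measures from the table as a dictionary."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "modes": statistics.multimode(data),  # a list, since there may be several modes
        "midrange": midrange(data),
        "geometric_mean": statistics.geometric_mean(data),  # data must all be positive
        "trimmed_mean": trimmed_mean(data, 0.10),
    }


# Hypothetical sample with one extremely high value (53)
sample = [2, 3, 3, 4, 5, 6, 7, 8, 9, 53]

for name, value in central_tendency(sample).items():
    print(name, value)
```

With this sample the mean is 10 while the median is 5.5, and dropping one value from each end brings the trimmed mean down to 5.625, which shows how much a single extreme value can pull the mean.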

 
 
Pros and Cons of each method
 
Statistic | Pro | Con
Mean | Familiar and it uses all the information in the sample | Influenced by extreme values
Median | Robust when extreme values exist | Ignores extremes and can be affected by gaps in the data
Mode | Useful for attribute data or discrete data with a small range | Could be more than one mode and it's not helpful for continuous data
Midrange | Easy to understand and calculate | Influenced by extreme values and ignores most data values
Geometric Mean | Useful for growth rates and mitigates high extremes | Less familiar. Requires that the data are all positive values.
Trimmed Mean | Mitigates the effect of extreme values | Excludes some data values that could be relevant.
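The "useful for growth rates" point is easier to see with numbers. The sketch below uses three made-up yearly growth rates (an assumption for illustration only) and compares the geometric mean of the growth factors with the ordinary mean.

```python
import statistics

# Hypothetical yearly growth rates: +10%, +50%, -20%
growth_rates = [0.10, 0.50, -0.20]

# Convert each rate to a growth factor so every value is positive
factors = [1 + r for r in growth_rates]

geometric = statistics.geometric_mean(factors)  # average growth factor per year
arithmetic = statistics.mean(factors)

# The actual total growth is the product of the yearly factors
total_growth = 1.0
for f in factors:
    total_growth *= f

print("total growth factor:", total_growth)
print("geometric mean compounded:", geometric ** len(factors))
print("arithmetic mean compounded:", arithmetic ** len(factors))
```

Compounding the geometric mean over the three years reproduces the actual total growth, while compounding the arithmetic mean overstates it. Note that the rates are turned into factors first, because the geometric mean requires all positive values.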

 
Skew's Effect on the Mean and Median
 
In a symmetric distribution, these two central tendency measures are the same.   In a skewed distribution, they'll differ.
 
In a left-skewed distribution, in which the outliers are extremely low values, the mean will be less than the median.  (The extremely low outliers are pulling the mean downward.)
 
In a right-skewed distribution, in which the outliers are extremely high values, the mean will be greater than the median. (The extremely high outliers are pulling the mean upward.)
 
In all distributions, the mode will be in the modal class (from Chapter 3). That is, the mode(s) will sit at the peaks of a histogram (or bar chart), because those are the most frequently occurring values.
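A quick way to check the skew rules is to compute the mean and median on a small skewed sample. The data below is invented for illustration; the left-skewed set is just a mirror image of the right-skewed one.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is extremely high
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]

# Mirror the values to get a left-skewed sample with one extremely low value
left_skewed = [41 - x for x in right_skewed]

for label, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(
        label,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "mode(s):", statistics.multimode(data),
    )
```

In the right-skewed sample the mean lands above the median, and in the mirrored sample it lands below, while the mode stays at the most frequent value in both.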
 
Measures of Dispersion
Dispersion describes how varied and spread out the values in a distribution are, and it tells you a lot about the data set. Measures of dispersion also help us judge how unusual a particular value is relative to the rest of the set.
 
Five Measures of Dispersion of a Sample
 
Statistic | Formula | Excel Command
Range (R) | $R = x_{max} - x_{min}$ (the maximum value in the set minus the minimum value) | =MAX(Data)-MIN(Data)
Sample Variance (s²) | $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$ (First, subtract the known sample mean (x̄) from each individual value (xi) and square the result. Then add up all of the squared differences. Divide that number by n-1.) | =VAR(Data)
Sample Standard Deviation (s) | $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$ (Just like in sample variance, first subtract the known sample mean (x̄) from each individual value (xi) and square the result. Then add up all of the squared differences. Divide that number by n-1. But now, also take the square root of the result.) | =STDEV(Data)
Coefficient of Variation (CV) | $CV = 100 \times \frac{s}{\bar{x}}$ (100 times the sample standard deviation divided by the sample mean) | None
Mean Absolute Deviation (MAD) | $MAD = \frac{\sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert}{n}$ (Just like sample variance, but without needing to square the result. The absolute value function will make all differences positive, so there won't be any cancellation of values when the sum is taken.) | =AVEDEV(Data)
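The steps described in the table can be sketched in Python as follows. The sample values and the `dispersion` helper name are assumptions for illustration; the arithmetic follows the formulas above, with n-1 in the variance and n in the MAD.

```python
import math
import statistics


def dispersion(data):
    """Return the five sample dispersion measures from the table."""
    n = len(data)
    mean = statistics.mean(data)

    # Subtract the mean from each value, square, then add up the results
    sum_squared_diffs = sum((x - mean) ** 2 for x in data)
    variance = sum_squared_diffs / (n - 1)  # sample variance divides by n-1
    std_dev = math.sqrt(variance)

    return {
        "range": max(data) - min(data),
        "variance": variance,
        "std_dev": std_dev,
        "cv_percent": 100 * std_dev / mean,  # only meaningful for a positive mean
        "mad": sum(abs(x - mean) for x in data) / n,  # MAD divides by n
    }


# Hypothetical sample
sample = [2, 4, 4, 4, 5, 5, 7, 9]

for name, value in dispersion(sample).items():
    print(name, value)
```

The hand-computed variance and standard deviation should agree with `statistics.variance` and `statistics.stdev`, which also use the n-1 divisor.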

 
Pros and Cons of Various Measures of Dispersion
 
Statistic | Pro | Con
Range | Easy to calculate. Easy to interpret | Sensitive to extreme data values
Variance | Plays a key role in mathematical statistics | Less intuitive meaning
Standard Deviation | Most commonly used measure. Expressed in same units as the data. ($, grams, etc) | Less intuitive meaning
Coefficient of Variation | Expresses relative variation in percent so you can compare data sets with different units of measurement | Requires nonnegative data
Mean Absolute Deviation | Easy to understand. | Lacks "nice" theoretical properties

 
There's more in this chapter that he listed.  I'm just out of time to input it and get it to you all in a period that would be useful to you.  
 
You can cover the following in the Power Point Slides (online learning center):
 
Chebyshev's Theorem and the Empirical Rule (Said Empirical Rule was more important. They're both theories on how to ID outliers.)
 
Defining a Standardized Variable - Z scores.  How to calculate
 
Percentiles, Quartiles and Box-and-Whisker Plots (including how to make and interpret Box-and-Whisker plots).  Fences, unusual data values and midranges
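As a small head start on the z-score item in the list above: a standardized value is z = (x - x̄) / s, the number of sample standard deviations a value sits from the sample mean. The sketch below is only an illustration with made-up data. The cutoff of 2 standard deviations for "unusual" is a common convention tied to the Empirical Rule and is an assumption here, so check the slides for the exact rule used in class.

```python
import statistics


def z_scores(data):
    """Standardize each value: (x - sample mean) / sample standard deviation."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return [(x - mean) / s for x in data]


# Hypothetical sample with one value far above the rest
sample = [10, 11, 12, 12, 13, 13, 14, 14, 15, 40]

scores = z_scores(sample)

# Assumed convention: flag values more than 2 standard deviations from the mean
unusual = [x for x, z in zip(sample, scores) if abs(z) > 2]

print("z-scores:", [round(z, 2) for z in scores])
print("unusual values:", unusual)
```

Here only the value 40 is flagged, since every other value is within one standard deviation of the mean.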

