353 - Topic 01 - Basics

378 days ago by Professor353

John Travis

Mississippi College

MATH 353 - Introduction to Mathematical Probability and Statistics

Textbook:  Tanis and Hogg, A Brief Course in Mathematical Statistics

 

Basic Concepts

Sets and Probability

Probabilities in general deal with measuring expectation.  Later in the course, one will discover natural ways to quantify these measurements, but for now experimenting may help us understand probability a bit better.

Given a set $S$, the corresponding power set $P(S)$ is the collection of all subsets of $S$.  In the experiment below, the given set $S$ consists of the elements $a$, $b$ and $c$.  The cardinality of a set is the number of elements in that set, so for this example the cardinality of $S$ is 3 and the cardinality of $P(S)$ should be 8.

%hide
%auto
S = Set(['a','b','c'])   # Creating the set S using its elements
print("Given the sample space")
print(S)
Power_Set = Set(S.subsets())
print("\nThe set P(S) of all subsets is given by")
print(Power_Set)
print("\nCardinality of P(S) is")
Power_Set.cardinality()
       
Given the sample space
{'a', 'c', 'b'}

The set P(S) of all subsets is given by
{{'a', 'c', 'b'}, {'b'}, {'a'}, {}, {'c', 'b'}, {'a', 'c'}, {'c'}, {'a',
'b'}}

Cardinality of P(S) is
8

Above, we saw that three elements give rise to a power set consisting of 8 elements.  Experiment with the number of elements in $S$ and make a conjecture regarding the number of elements in $P(S)$ for sets $S$ of various sizes.
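The same experiment can be sketched outside of Sage in plain Python.  This is only an illustration under assumptions: the helper name power_set and the choice of letters are not part of the worksheet.  It lists the size of $S$ next to the size of $P(S)$ and leaves the pattern for you to spot.

```python
from itertools import combinations

def power_set(elements):
    """Return all subsets of `elements` as a list of frozensets."""
    items = list(elements)
    subsets = []
    # Collect the subsets of each possible size r = 0, 1, ..., len(items).
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            subsets.append(frozenset(combo))
    return subsets

# Print |S| next to |P(S)| for sets S with 0 through 5 elements.
letters = "abcde"
for n in range(len(letters) + 1):
    S = set(letters[:n])
    print(n, len(power_set(S)))
```

Compare the two columns of output when making your conjecture.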

%hide
%auto
# Using the ideas from below to make the input of the elements interactive
print(html('<p><p>Enter any desired elements for your set separating each by a comma.  Click on "Update" twice when you are done.'))
@interact
def _(Elements=('a,b'),auto_update=False):
    # Creating the set S using its elements
    S = Set(Elements.replace(',',' ').split())
    print(html("<p>Given the sample space"))
    print(S)
    Power_Set = S.subsets()
    print("\nThe set P(S) of all subsets is given by")
    show(list(Power_Set))
    print("\nCardinality of S is "+str(S.cardinality()))
    print("\nCardinality of P(S) is "+str(Power_Set.cardinality()))
       

Enter any desired elements for your set separating each by a comma. Click on "Update" twice when you are done.


In other situations, you might have several sets that share some elements.  A nice visual way to organize the elements is a Venn diagram.  Below is a Venn diagram application for data divided among three sets.  Play with the elements in each set and see how the resulting Venn diagram changes.

%hide
%auto
# This part was copied verbatim from http://wiki.sagemath.org/interact/misc
def f(s, braces=True):
    t = ', '.join(sorted(list(s)))
    if braces: return '{' + t + '}'
    return t
def g(s): return set(str(s).replace(',',' ').split())

@interact
def _(X='1,2,3,a', Y='2,a,3,4,apple', Z='a,b,10,apple'):
    S = [g(X), g(Y), g(Z)]
    X,Y,Z = S
    XY = X & Y
    XZ = X & Z
    YZ = Y & Z
    XYZ = XY & Z
    html('<center>')
    html("$X \cap Y$ = %s"%f(XY))
    html("$X \cap Z$ = %s"%f(XZ))
    html("$Y \cap Z$ = %s"%f(YZ))
    html("$X \cap Y \cap Z$ = %s"%f(XYZ))
    html('</center>')
    centers = [(cos(n*2*pi/3), sin(n*2*pi/3)) for n in [0,1,2]]
    scale = 1.7
    clr = ['yellow', 'blue', 'green']
    G = Graphics()
    for i in range(len(S)):
        G += circle(centers[i], scale, rgbcolor=clr[i], fill=True, alpha=0.3)
    for i in range(len(S)):
        G += circle(centers[i], scale, rgbcolor='black')
    # Plot what is in one but neither other
    for i in range(len(S)):
        Z = set(S[i])
        for j in range(1,len(S)):
            Z = Z.difference(S[(i+j)%3])
        G += text(f(Z,braces=False), (1.5*centers[i][0],1.7*centers[i][1]), rgbcolor='black')
    # Plot pairs of intersections
    for i in range(len(S)):
        Z = (set(S[i]) & S[(i+1)%3]) - set(XYZ)
        C = (1.3*cos(i*2*pi/3 + pi/3), 1.3*sin(i*2*pi/3 + pi/3))
        G += text(f(Z,braces=False), C, rgbcolor='black')
    # Plot intersection of all three
    G += text(f(XYZ,braces=False), (0,0), rgbcolor='black')
    # Show it
    G.show(aspect_ratio=1, axes=False)
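The bookkeeping behind the diagram can be sketched without any plotting.  The plain-Python example below assumes the same three default sets as the cell above; the function name venn_regions and the region labels are made up for illustration.  It splits the elements into the seven regions of a three-set Venn diagram using set intersection and difference.

```python
def venn_regions(X, Y, Z):
    """Split the elements of three sets into the seven regions of a Venn diagram."""
    XYZ = X & Y & Z  # elements common to all three sets
    return {
        "X only": X - Y - Z,
        "Y only": Y - X - Z,
        "Z only": Z - X - Y,
        "X and Y only": (X & Y) - XYZ,
        "X and Z only": (X & Z) - XYZ,
        "Y and Z only": (Y & Z) - XYZ,
        "X, Y and Z": XYZ,
    }

X = {"1", "2", "3", "a"}
Y = {"2", "a", "3", "4", "apple"}
Z = {"a", "b", "10", "apple"}
for region, members in venn_regions(X, Y, Z).items():
    print(region, sorted(members))
```

Every element of the three sets lands in exactly one region, which is why the diagram never shows an element twice.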
       


Probability and Relative Frequency

Sometimes we can logically determine what the probability for a given outcome should be.  At other times, it may be more difficult or impossible to deduce such values.  If so, we can often determine an "empirical probability" by using the relative frequency from an experiment.
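As a sketch of what "relative frequency" means here (the symbols $n$ and $N(A)$ are notation assumed for this note, not taken from the textbook): if an experiment is repeated $n$ times and the outcome $A$ is observed $N(A)$ times, then

$$\text{relative frequency of } A = \frac{N(A)}{n}, \qquad 0 \le \frac{N(A)}{n} \le 1.$$

For instance, with made-up counts, observing an outcome 23 times in 50 trials gives a relative frequency of $23/50 = 0.46$.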

Coins and Dice

For the simplest example, let's flip some coins.  Notice how the relative frequency changes as you change the number of flips.  You will likely even get a different result when you reuse the same number of flips again and again.

%hide
%auto
# Now, let's flip several coins.
# We first set up the set of possible outcomes (the sample space) for a coin
coin = ["Heads", "Tails"]
@interact
def _(num_rolls = slider([5..5000],label="Number of Flips")):
    rolls = [choice(coin) for roll in range(num_rolls)]
    show(rolls)
    freq = [0,0]
    for outcome in rolls:
        if (outcome=='Tails'):
            freq[0] = freq[0]+1
        else:
            freq[1] = freq[1]+1
    # freq = [rolls.count(outcome) for outcome in set(rolls)]   # something was not counting correctly using this
    print("\nThe frequency of tails = "+ str(freq[0])+" and heads = "+ str(freq[1])+".")
    rel = [freq[0]/num_rolls,freq[1]/num_rolls]
    print("\nThe relative frequencies for Tails and Heads:"+str(rel))
    show(bar_chart(freq,axes=False,ymin=0))   # A histogram of the results
       

Because these are random experiments in which the likelihood of a given outcome on each trial does not depend on previous trials (the word to use is "independent"), you could possibly get all of one outcome or the other.  However, you wouldn't "expect" that possibility for larger values of num_rolls.
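To see how rarely a run of identical outcomes shows up, here is a minimal plain-Python sketch.  The function name fraction_all_same, the number of repeated experiments, and the fixed seed are all assumptions made for illustration.  It repeats the coin experiment many times and reports the fraction of experiments in which every flip came out the same.

```python
import random

def fraction_all_same(num_flips, num_experiments=10000, seed=0):
    """Estimate how often num_flips independent flips all show the same face."""
    rng = random.Random(seed)  # fixed seed so a rerun gives the same numbers
    count = 0
    for _ in range(num_experiments):
        flips = [rng.choice(["Heads", "Tails"]) for _ in range(num_flips)]
        if len(set(flips)) == 1:  # only one distinct face appeared
            count += 1
    return count / num_experiments

for n in [2, 5, 10, 20]:
    print(n, fraction_all_same(n))
```

Change the seed or the list of flip counts to repeat the experiment with different random draws.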

Tasks for you:

  • With both coins and dice (below), experiment with several small values for n (say around 10) to see that the results vary significantly from trial to trial.
  • Experiment again with several medium values for n (say around 50) to see that the results still vary but not quite so dramatically.
  • Experiment again with several large values for n (say from 500 up to 5000) to see that the results vary very little.

Make a conjecture regarding the precise value of the actual probability of each outcome of a coin and again for each outcome of a die.  Write a paragraph indicating how the results from your experiment corroborate (provide evidence for justifying) your conjecture.
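The variation described in the tasks above can also be measured rather than eyeballed.  The plain-Python sketch below is one possible way to do so; the helper names, the 20 repetitions, and the seed are assumptions, not part of the worksheet.  For each number of flips it repeats the experiment and reports the gap between the largest and smallest relative frequency of heads.

```python
import random

def heads_relative_frequency(num_flips, rng):
    """Flip a coin num_flips times and return the relative frequency of heads."""
    heads = sum(1 for _ in range(num_flips)
                if rng.choice(["Heads", "Tails"]) == "Heads")
    return heads / num_flips

def spread(num_flips, repetitions=20, seed=1):
    """Largest minus smallest relative frequency over repeated experiments."""
    rng = random.Random(seed)
    results = [heads_relative_frequency(num_flips, rng)
               for _ in range(repetitions)]
    return max(results) - min(results)

for n in [10, 50, 500, 5000]:
    print(n, round(spread(n), 4))
```

A smaller spread means the repeated experiments agree more closely with one another.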

 

%hide
%auto
# Now, let's roll some dice.
# We first set up the set of possible outcomes (the sample space) for a single die
@interact
def _(num_rolls = slider([20..5000],label='Number of rolls'),num_sides = slider(4,20,1,6,label='Number of sides')):
    die = list((1..num_sides))
    rolls = [choice(die) for roll in range(num_rolls)]
    show(rolls)
    freq = [rolls.count(outcome) for outcome in set(die)]   # count the numbers for each outcome
    print('The frequencies of each outcome are '+str(freq))
    print('The relative frequencies of each outcome:')
    rel_freq = [freq[outcome-1]/num_rolls for outcome in set(die)]   # make frequencies relative
    print(rel_freq)
    fs = []
    for f in rel_freq:
        fs.append(f.n(digits=4))
    print(fs)
    show(bar_chart(freq,axes=False,ymin=0))   # A histogram of the results
       

%hide
%auto
# Now, let's roll two dice at a time.
# We first set up the set of possible outcomes (the sample space) for a single die
@interact
def _(num_rolls = slider([20..5000],label='Number of rolls'),num_sides = slider(4,20,1,6,label='Number of sides')):
    die = list((1..num_sides))
    dice = list((2..num_sides*2))
    rolls = [(choice(die),choice(die)) for roll in range(num_rolls)]
    sums = [sum(rolls[roll]) for roll in range(num_rolls)]
    show(rolls)
    freq = [sums.count(outcome) for outcome in set(dice)]   # count the numbers for each outcome
    print('The frequencies of each outcome are '+str(freq))
    print('The relative frequencies of each outcome:')
    rel_freq = [freq[outcome-2]/num_rolls for outcome in set(dice)]   # make frequencies relative
    print(rel_freq)
    show(bar_chart(freq,axes=False,ymin=0))   # A histogram of the results
    print("Relative frequency of "+str(dice[0])+" is about "+str(rel_freq[0].n(digits=4)))
    print("Relative frequency of "+str(dice[num_sides-1])+" is about "+str(rel_freq[num_sides-1].n(digits=4)))
       