## Subtitles

• Choosing which statistical test to use.

• There are many different tests you can use in statistics.

• Sometimes it can be quite difficult to know which is the correct test to use.

• This video will talk about seven tests you are likely to use

• involving means, proportions and relationships.

• When you are trying to work out which is the most appropriate test

• there are three questions you should ask

• 1. What level of measurement was used for the data we are analyzing?

• 2. How many samples do we have?

• 3. What is the purpose of our analysis?

• I will now explain each of these questions

• 1. Data or level of measurement

• Is our data nominal or interval/ratio?

• Nominal data is also called categorical, qualitative

• or nonparametric

• Examples of nominal data are color

• whether parts are defective or not,

• or preferred type of chocolate.

• Nominal summary values are usually stated as frequencies, proportions or

• percentages.

• The tests that involve nominal data are:

• Test for a proportion

• Difference of two proportions

• and chi-squared test for independence

• The other type of data

• is interval/ratio

• also called quantitative

• Examples of interval/ratio data are

• daily sales figures for choconutties

• weight of peanuts or temperature

• The most common summary value for interval/ratio data is a mean.

• Tests that involve interval/ratio data are:

• Test for a mean

• difference of two means - independent samples

• difference of two means - paired

• and regression analysis.

• For more help on levels of measurement see our video:

• "Types of data nominal, ordinal, interval/ratio"

• Ordinal data can be classified with nominal or interval/ratio

• depending on the circumstances.

• 2. Samples

• Next we ask how many samples are involved

• Is there one sample for which we are testing the relevant statistic

• against a hypothesized value

• or are there two samples

• which are being compared with each other

• or

• is there one sample but each observation has a measure or score

• for more than one variable?

• The same sample is measured twice.

• If we wish to compare a proportion or a mean against a given value,

• this will involve one sample.

• If we're comparing two different lots of people or things such as men and women

• or people from two different departments

• then we would have two samples.

• If we have two sets of information on the same people or things

• we would say we have one sample with two variables.

• An example is one set of days and information on how many choconutties

• are sold and what the temperature was.

• Or - one set of people and information on their gender and preferred type of chocolate.

• What is the purpose of the analysis?

• We can be testing against a hypothesized value

• comparing two statistics

• or looking for a relationship.

• Chi-squared test for independence and regression are similar

• in that they are looking at the relationship between two variables

• The difference between them is in the kind of data.

• If you would summarize the data in a table,

• you would use a chi-squared test for independence

• whereas if you would put it on a scatter plot

• you would use regression analysis.
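The three questions can be read as a lookup from (data, samples, purpose) to a test. The sketch below is one way to write that down. The function name `choose_test` and the exact label strings are assumptions made for illustration; only the seven test names come from the video.

```python
def choose_test(data, samples, purpose):
    """Map the answers to the three questions onto one of the seven tests.

    data:    "nominal" or "interval/ratio"
    samples: "one", "two", or "one, two variables"
    purpose: "hypothesized value", "compare", or "relationship"
    (These label strings are illustrative, not terms fixed by the video.)
    """
    table = {
        ("nominal", "one", "hypothesized value"):
            "test for a proportion",
        ("nominal", "two", "compare"):
            "difference of two proportions",
        ("nominal", "one, two variables", "relationship"):
            "chi-squared test for independence",
        ("interval/ratio", "one", "hypothesized value"):
            "test for a mean",
        ("interval/ratio", "two", "compare"):
            "difference of two means - independent samples",
        ("interval/ratio", "one, two variables", "compare"):
            "difference of two means - paired",
        ("interval/ratio", "one, two variables", "relationship"):
            "regression analysis",
    }
    key = (data, samples, purpose)
    if key not in table:
        raise ValueError(f"no test in this list matches {key}")
    return table[key]


print(choose_test("interval/ratio", "one", "hypothesized value"))
```

Combinations outside the seven tests raise an error rather than guessing, since the video only covers these seven.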

• Here is an example for each of these tests.

• They relate back to our other videos teaching about hypothesis testing.

• After each description of the scenario pause the video

• and see if you can identify the correct test before we tell you the answer.

• Helen is still selling choconutties.

• Example one:

• sufficient nuts.

• Helen was concerned whether the quantity of nuts was sufficient in her choconutties.

• She took a sample of twenty packets and found the weight of nuts in

• each packet

• Pause the video

• 1. Data

• The weight was interval/ratio data.

• 2. Samples

• There was just one sample of twenty packets of choconutties.

• 3. Purpose. Helen was comparing against a given value

• Thus, the test she needs to use is Test for a mean.
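A test for a mean compares the sample mean with the given value, scaled by the standard error. The sketch below computes that t statistic. The video gives neither the weights nor the target weight, so the five values and the target of 30 grams are hypothetical and only show the shape of the calculation.

```python
import math
import statistics


def one_sample_t(values, hypothesized_mean):
    """t statistic for a one-sample test for a mean."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - hypothesized_mean) / standard_error


# Hypothetical nut weights in grams (not data from the video).
weights = [29.1, 30.4, 28.7, 31.0, 29.5]
t = one_sample_t(weights, hypothesized_mean=30.0)
print(f"t = {t:.2f} on {len(weights) - 1} degrees of freedom")
```

Helen's real sample of twenty packets would give 19 degrees of freedom.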

• Example Two

• Prize tickets

• In a promotional campaign, twenty percent of all packs of choconutties should have winning tickets.

• Helen takes a sample of fifty packets and finds that seven of them

• have winning tickets

• Pause the video

• 1. Data: For each bar we are saying yes or no, according to whether or not

• there is a ticket.

• This is nominal data from which we get a sample proportion of seven out of fifty

• Or 0.14

• Samples

• There is one sample of fifty packets

• Purpose.

• Helen is comparing the sample value against a given value: twenty percent

• We conclude that the test she needs to use is test for a proportion.
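With the figures from this example (seven winners in fifty packets, against a given value of twenty percent) the test statistic can be worked out directly. The sketch below uses the usual normal approximation and a two-sided p-value. The two-sided choice is an assumption, as the video does not state Helen's alternative hypothesis.

```python
import math
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """z statistic and two-sided p-value for a test for a proportion."""
    p_hat = successes / n
    # Standard error uses the hypothesized proportion p0.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_value


p_hat, z, p_value = one_proportion_z(successes=7, n=50, p0=0.20)
print(f"sample proportion = {p_hat:.2f}, z = {z:.2f}, p-value = {p_value:.3f}")
```

The sample proportion is 0.14 and z comes out at about -1.06.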

• Example three

• Bar longevity compared with nuttabars.

• Helen thinks her choconutties last longer than the competition, nuttabars.

• She gets 36 people to eat one of each, and records their eating times.

• Pause now

• 1. Data. Helen collects times taken in seconds

• so this is interval/ratio data.

• 2. Samples

• There is one sample of thirty-six people but with two scores for each person

• the time for the choconuttie and the time for the nuttabar.

• 3. Purpose

• She is looking at whether there is a difference in the amount of time taken

• for each of the bars.

• Thus the test is difference of two means, paired sample.
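A paired test works on the difference between the two scores for each person, then treats those differences as a single sample. The sketch below shows that step. The video gives no eating times, so the five pairs of times are hypothetical.

```python
import math
import statistics


def paired_t(first, second):
    """t statistic for a paired difference of two means."""
    if len(first) != len(second):
        raise ValueError("paired samples need one score of each kind per person")
    differences = [a - b for a, b in zip(first, second)]
    standard_error = statistics.stdev(differences) / math.sqrt(len(differences))
    return statistics.mean(differences) / standard_error


# Hypothetical eating times in seconds for five people (not data from the video).
choconuttie = [62.0, 75.0, 58.0, 80.0, 66.0]
nuttabar = [60.0, 70.0, 59.0, 74.0, 65.0]
t = paired_t(choconuttie, nuttabar)
print(f"t = {t:.2f} on {len(choconuttie) - 1} degrees of freedom")
```

With Helen's thirty-six people there would be 35 degrees of freedom.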

• Example four

• Defective wrapping from two wrapping machines

• Helen thinks there is a difference in performance between

• the two wrapping machines in her factory. She checks 200 bars from

• one machine and 150 bars from the other.

• For each bar she is seeing if the wrapping is satisfactory or not

• She finds that ten out of two hundred bars from the first machine

• and nine out of 150 bars from the second machine have defective wrapping.

• Pause the video

• Data. The information for each bar is OK or not OK

• This is nominal data.

• It has been summarized as frequencies.

• 2. Samples. There are two independent samples,

• one sample from each of the two machines

• 3. Purpose

• Helen is comparing the proportions from the two samples

• We can see that the test is

• difference of two proportions.
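The counts in this example are enough to compute the test statistic. The sketch below assumes the ten and nine bars are the ones whose wrapping was not satisfactory, which the transcript implies but does not state outright. It uses the pooled-proportion form of the z statistic with a two-sided p-value, both common choices rather than something the video specifies.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic and two-sided p-value for a difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    # Pool the two samples to estimate the common proportion under H0.
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p1, p2, z, p_value


p1, p2, z, p_value = two_proportion_z(x1=10, n1=200, x2=9, n2=150)
print(f"p1 = {p1:.2f}, p2 = {p2:.2f}, z = {z:.2f}, p-value = {p_value:.2f}")
```

The two sample proportions are 0.05 and 0.06, and z is about -0.41.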

• Example five

• Do stickers help sales?

• Helen is exploring whether having free stickers makes a difference to sales.

• She has the sales figures for thirteen days when she did offer free stickers

• and ten days when she did not. Pause and decide on the test

• Data. For each day Helen has a number or value corresponding to the sales for that day

• This is interval/ratio data

• It is summarized as a mean number of sales.

• 2. Samples

• There are two samples: one sample for days with stickers

• and one sample for days without.

• 3. Purpose

• Helen is comparing the average sales figures for the two treatments

• we conclude that the test to use is...

• Difference of two means - independent samples
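For two independent samples, the difference between the two means is divided by a standard error built from both samples. The sketch below uses Welch's unequal-variance form, which is one common choice; the video does not say which version it uses. The sales figures are hypothetical and shorter than Helen's thirteen and ten days.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """t statistic for a difference of two means, independent samples (Welch form)."""
    standard_error = math.sqrt(
        statistics.variance(sample_a) / len(sample_a)
        + statistics.variance(sample_b) / len(sample_b)
    )
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error


# Hypothetical daily sales (not data from the video).
with_stickers = [120, 135, 128, 140, 131]
without_stickers = [118, 125, 122, 130]
t = welch_t(with_stickers, without_stickers)
print(f"t = {t:.2f}")
```

Unlike the paired case, the two samples need not be the same size, because no day appears in both groups.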

• Example six

• Are sales affected by temperature?

• Helen wants to see if there is a relationship between the daily

• temperature and sales of choconutties.

• She has data on sales and temperature

• for thirty weekdays of sales

• Pause!

• Data. Sales and temperature are both interval/ratio variables

• Samples

• There is one sample of thirty days with two measures or scores for each day.

• Purpose.

• Helen is interested in the relationship between sales and temperature

• This leads us to decide that the test is regression.
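Regression starts by fitting a straight line to the pairs of values, here one (temperature, sales) pair per day. The sketch below fits that line by least squares. The six pairs are hypothetical, and the code stops at the fitted line; testing whether the slope differs from zero is a further step not shown here.

```python
import statistics


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical daily temperature (degrees C) and sales (not data from the video).
temperature = [18, 21, 24, 27, 30, 33]
sales = [150, 141, 139, 128, 122, 110]
slope, intercept = fit_line(temperature, sales)
print(f"sales = {intercept:.1f} + {slope:.2f} * temperature")
```

A slope near zero would suggest little linear relationship between temperature and sales.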

• Example seven

• Men and women and chocolate preference

• Helen is thinking of selling dark chocolate, milk chocolate and white chocolate

• choconutties.

• She thinks that men and women might have different preferences with regard to type.

• She collects data from fifty customers, noting down if they are men or women

• and asking them which variety they prefer.

• Pause the video and decide.

• Data. Helen records the type of chocolate and sex of person.

• These are both nominal variables.