Posted: June 11th, 2022
only part 3
MATH20000D Group Case 1
Case Description: You work at a statistical company.
You are required to analyze the data and summarize your findings in a report. In this scenario, you are a team of professional statisticians, not students. Do not quote the questions, and avoid phrases such as “don’t know”, “probably”, and “might have”. The report should be clear and concise so that it can be read by people who may not be familiar with statistical terms.
Form groups of up to 5 people by week 3. Otherwise, you will be assigned to a group randomly.
The report is to be completed in Microsoft Word, including tables and graphs copied from Excel (actual tables and graphs, not screenshots). Watch my videos on how to make good graphs. Answer all questions using short, easy to read sentences.
Remove the instructions but keep the section titles and numbering.
Collecting data: Pick a city and a year (2021 or 2022), then use 3 consecutive months starting with the month that matches your group number (Group 1 uses Jan-Feb-Mar, Group 2 uses Feb-Mar-Apr, etc.).
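The month rule above can be sketched in a few lines of Python. This is only an illustration: the function name `group_months` is hypothetical, and the wrap-around past December (for example, group 11 getting Nov-Dec-Jan) is an assumption, since the instructions only give the first two groups.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def group_months(group_number):
    """Return the three consecutive months assigned to a group number."""
    if group_number < 1:
        raise ValueError("group numbers start at 1")
    # Group 1 starts in January, group 2 in February, and so on.
    # ASSUMPTION: the window wraps around after December; the brief
    # does not say what happens for groups 11 and above.
    start = (group_number - 1) % 12
    return [MONTHS[(start + offset) % 12] for offset in range(3)]

print(group_months(1))  # ['Jan', 'Feb', 'Mar']
print(group_months(2))  # ['Feb', 'Mar', 'Apr']
```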
Some options to collect data from:
Search for daily data and click “Download data” in CSV format:
Select station, year, and pollutant. Choose the CSV output type.
Select station and show year, get readings. Copy and paste the whole page to Excel.
Select Climate Daily/Forecast/Sun and click download.
Select a year, then Integrated Data.
5) You can use any other websites with raw daily data related to weather/environment.
Clean up your data by keeping only 3 months (keep the dates column) and removing empty columns. Save in Excel, name the tab “Raw Data”. Your final Excel file will also have “Quantitative” and “Qualitative” tabs, with the most important charts copied to the Word report.
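The clean-up step is meant to be done in Excel, but the same logic can be checked with a short script. The sketch below assumes the download is a CSV file with ISO-formatted dates (YYYY-MM-DD); the column names, the sample rows, and the helper `clean_rows` are all hypothetical.

```python
import csv
import io
from datetime import date

def clean_rows(rows, date_column, year, months):
    """Keep rows dated in the chosen months; drop columns empty in every kept row."""
    kept = [
        row for row in rows
        if (d := date.fromisoformat(row[date_column])).year == year
        and d.month in months
    ]
    if not kept:
        return []
    # The dates column is always kept, as the instructions require.
    non_empty = [
        col for col in kept[0]
        if col == date_column or any((row[col] or "").strip() for row in kept)
    ]
    return [{col: row[col] for col in non_empty} for row in kept]

# Hypothetical sample standing in for a downloaded CSV file.
sample = io.StringIO(
    "Date,Max Temp (C),Snow (cm),Flag\n"
    "2022-01-31,-4.0,2.0,\n"
    "2022-02-01,-6.5,,\n"
    "2022-04-30,12.1,0.0,\n"
    "2022-05-01,15.3,0.0,\n"
)
rows = list(csv.DictReader(sample))
cleaned = clean_rows(rows, "Date", 2022, {2, 3, 4})  # Feb-Mar-Apr
for row in cleaned:
    print(row)
```

Here the "Flag" column is dropped because it is blank in every kept row, while "Snow (cm)" stays because at least one kept row has a reading.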
· Course and Group: MATH20000D Case 1 report by Group #
· Project Name: “A short study of YourData in YourCity in Month1-Month3 of YourYear.”
· Due Date:
· Date Submitted:
· Group Member Names and ID Numbers:
· Supervisor: Dr. Natasha Pshenitsyna
(half a page)
Give 2 examples of weather/environment data (not necessarily from the websites) for each of the levels of measurement: nominal, ordinal, interval, ratio (8 examples total). For each example, specify the data type (qualitative, quantitative discrete, quantitative continuous). Using your examples, explain the difference between various levels of measurement.
For each example, suggest the best measure of centre to summarize the data, and a type of graph to visualize it. Explain your choices.
Visualization of quantitative data.
(half a page)
Choose a variable of the ratio or interval level.
1.1 Create a frequency distribution (pivot table) for the variable, and the corresponding histogram with all necessary labels. Use a reasonable number of classes with round numbers for limits and continuous intervals. Use relative frequencies (%). Filter out blank/irrelevant values in the pivot table. Save in Excel, name the tab “Quantitative”. Copy the pivot table and the graph to the Word report.
1.2 Describe the shape of the histogram (mound shaped, bi-modal, multi-modal or almost uniform), the skewness (left, right, or symmetric), and any high or low outliers.
1.3 Explain in simple terms why the variable might be distributed this way, including the peak(s), symmetry or skewness, low or high or no outliers.
Example: Shoe sizes have a bi-modal distribution, with the two peaks corresponding to the most common sizes among surveyed females and males: size 8 for 45% of females; size 11 for 36% of males. The longer right tail of the distribution represents people with shoe sizes above typical. There are no low and no high outliers in the sample (no extremely small or extremely large shoe sizes), as shoe sizes have a limited range.
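To cross-check the pivot table from 1.1, the relative frequency distribution can be reproduced outside Excel. The sketch below assumes equal-width classes of the form "lower to less than upper" with round limits; the temperature readings and the function name `relative_frequencies` are hypothetical.

```python
import math
from collections import Counter

def relative_frequencies(values, class_width):
    """Return {(lower, upper): percent} for equal-width classes [lower, upper)."""
    # Blank readings (None) are filtered out, as in the pivot table.
    data = [v for v in values if v is not None]
    counts = Counter(math.floor(v / class_width) for v in data)
    table = {}
    # Walk every class between the lowest and highest so that empty
    # classes still appear and the intervals stay continuous.
    for k in range(min(counts), max(counts) + 1):
        lower = k * class_width
        table[(lower, lower + class_width)] = 100 * counts.get(k, 0) / len(data)
    return table

# Hypothetical daily maximum temperatures in degrees Celsius.
temps = [-3.2, 1.5, 4.0, 4.9, 6.1, 7.7, 8.3, 9.0, None, 12.4, 21.0]
for (lower, upper), pct in relative_frequencies(temps, 5).items():
    print(f"{lower:>4} to <{upper:<4} {pct:5.1f}%  {'#' * round(pct / 5)}")
```

The text bars give a rough preview of the histogram shape; in this made-up sample the single reading in the 20 to 25 class sits apart from the rest, which is the kind of high value to comment on in 1.2.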
Analysis of quantitative data.
(half a page)
1.4 Compute all possible measures of centre; several measures of spread; several measures of shape for the same quantitative variable you visualized in the previous part. Save in the “Quantitative” tab; paste as a short table in the report.
1.5 Choose the most useful measure(s) of centre in this case and explain what it means (include the value with units and the context).
1.6 Choose the most useful measure(s) of spread in this case and explain what it means (include the value with units and the context).
1.7 Choose the most useful measure(s) of shape in this case and explain what it means (include the value with units and the context).
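The measures requested in 1.4 can be sanity-checked with Python's standard `statistics` module. This is a sketch under stated assumptions: the wind speed sample is hypothetical, the quartiles use the module's default method (which can differ slightly from Excel's quartile functions), and the shape measure is Pearson's median-based skewness coefficient, not necessarily the formula a spreadsheet uses.

```python
import statistics

def summarize(values):
    """Return common measures of centre, spread, and shape for a numeric sample."""
    data = sorted(values)
    mean = statistics.mean(data)
    median = statistics.median(data)
    stdev = statistics.stdev(data)               # sample standard deviation
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles, default method
    return {
        "mean": mean,
        "median": median,
        "modes": statistics.multimode(data),
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
        "stdev": stdev,
        # Pearson's second skewness coefficient (an assumption of this
        # sketch): positive values indicate a longer right tail.
        "skewness": 3 * (mean - median) / stdev,
    }

# Hypothetical daily wind speeds in km/h, with one unusually windy day.
wind = [8, 10, 10, 12, 13, 15, 17, 35]
for name, value in summarize(wind).items():
    print(name, value)
```

In this made-up sample the mean (15 km/h) is pulled above the median (12.5 km/h) by the single 35 km/h day, which is why the median and the interquartile range are often the more useful summary for skewed data.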
Visualization and analysis of qualitative data.
(half a page)
Choose another variable, of the ordinal or nominal level. If there are no qualitative variables in your data (check carefully!), convert one of the variables to the ordinal level. Note that variables with units (temperature, wind speed) are better kept numeric; instead, use unitless variables (for example, low-medium-high levels for air quality or UV index).
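Converting a numeric variable to the ordinal level amounts to applying cut-offs. A minimal sketch follows; the UV index readings are hypothetical, and the cut-offs 3 and 6 are illustrative only, so use the thresholds published by your data source.

```python
def to_level(value, medium_from, high_from):
    """Map a numeric reading to an ordinal level using two cut-offs."""
    if value is None:
        return None  # keep blanks blank so they can be filtered out later
    if value >= high_from:
        return "high"
    if value >= medium_from:
        return "medium"
    return "low"

# Hypothetical daily UV index readings; cut-offs are assumptions.
uv_index = [1, 2, 5, 7, None, 3, 8, 2]
levels = [to_level(v, medium_from=3, high_from=6) for v in uv_index]
print(levels)
```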
1.8 Create a frequency distribution (pivot table) for this variable and a graph with all necessary labels. Use relative frequencies (%). Filter out blank values in the pivot table. Save in Excel, name the tab “Qualitative”. Copy the pivot table and the graph to the Word report.
1.9 Find all possible measures of centre for the variable. Explain what each of them represents and how they can be useful.
1.10 Write 2 observations about the distribution of this variable.
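For an ordinal variable, the frequency table from 1.8 and the measures of centre from 1.9 can be sketched as below. The air quality levels are hypothetical, and the function name `describe_ordinal` is an assumption of this example. The median is taken as the middle value after sorting by the category order (the lower middle value when the count is even).

```python
from collections import Counter

ORDER = ["low", "medium", "high"]  # ordinal scale, lowest to highest

def describe_ordinal(levels, order=ORDER):
    """Return relative frequencies (%), the mode(s), and the median category."""
    data = [x for x in levels if x is not None]  # filter out blanks
    counts = Counter(data)
    percents = {cat: 100 * counts[cat] / len(data) for cat in order}
    top = max(counts.values())
    modes = [cat for cat in order if counts[cat] == top]
    ranked = sorted(data, key=order.index)
    median = ranked[(len(ranked) - 1) // 2]
    return percents, modes, median

# Hypothetical daily air quality levels with one blank day.
air_quality = ["medium", "low", "high", "medium", None,
               "low", "medium", "high", "low", "medium"]
percents, modes, median = describe_ordinal(air_quality)
for cat, pct in percents.items():
    print(f"{cat:<7}{pct:5.1f}%")
print("mode:", modes, "median:", median)
```

A mean is not meaningful for categories like these. For a nominal variable, which has no natural order, only the mode applies.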
Justifying the study
(half a page)
Give three examples of organizations, groups, or individuals who might use the collected data, and explain how it would be helpful to them.
Give three examples of weather or environment related data (not from your data file) that could also be useful for those groups.