## D1. Data Literacy

### Specific Expectations

#### Data Collection and Organization

D1.1

explain the importance of various sampling techniques for collecting a sample of data that is representative of a population

**sampling techniques:**

- Sampling is gathering information by using a subset of a population. It is more efficient and practical than trying to get data from every item in a population. It is more cost effective too.
- Simple random sampling is a method used to obtain a subset such that each subject in the population has an equal chance of being selected (e.g., randomly selecting 10% of the population using a random number generator).
- Stratified random sampling involves partitioning the population into strata and then taking a random sample from each. For example, a school population could be divided into two strata: one with students who take a bus to school and the other with those who don’t take a bus. Then a survey could be given to 10% of the population randomly selected from each of these strata.
- Systematic random sampling is used when the subjects from a population are selected through a systematic approach that has been randomly determined. For example, a sample could be determined from an alphabetized list of names, using a starting name and count (e.g., every fourth name) that are randomly selected.
- Data from a sample is used to make judgements and predictions about a population.

*Note*

- A census is an attempt to collect data from an entire population.
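The three sampling techniques described above can be sketched in a few lines of Python. This is only an illustration: the function names, the school of 100 students, the 40/60 bus split, and the fixed seed are all assumptions made for the example, and the step size for systematic sampling is passed in rather than chosen at random.

```python
import random

def simple_random_sample(population, fraction, rng):
    """Give every subject an equal chance of being selected."""
    sample_size = round(len(population) * fraction)
    return rng.sample(population, sample_size)

def stratified_random_sample(strata, fraction, rng):
    """Take a simple random sample of the same fraction from each stratum."""
    sample = []
    for members in strata.values():
        sample.extend(simple_random_sample(members, fraction, rng))
    return sample

def systematic_random_sample(ordered_population, step, rng):
    """Pick a random starting position, then take every step-th subject."""
    start = rng.randrange(step)
    return ordered_population[start::step]

# Hypothetical school of 100 students, 40 of whom take a bus.
rng = random.Random(2024)  # fixed seed so the sketch is repeatable
students = [f"Student {n:03d}" for n in range(1, 101)]
strata = {"bus": students[:40], "no bus": students[40:]}

print(len(simple_random_sample(students, 0.10, rng)))      # 10 students
print(len(stratified_random_sample(strata, 0.10, rng)))    # 4 + 6 = 10 students
print(len(systematic_random_sample(students, 4, rng)))     # every fourth name: 25 students
```

Note that the stratified sample keeps the bus and non-bus groups in the same proportion as the population, which a simple random sample is not guaranteed to do.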

As students work with larger populations, it is important for them to know when it is appropriate to do a census and when a sample is enough. Provide students with various scenarios, and have them discuss potential samples. For example:

- If the marketing department of a large toy company wants to find out whether the company’s new toy is successful, whom should they survey?
- A new granola bar is being developed, and the company wants to run a taste test to learn whether people like the new flavour. Who should be included among the sample taste-testers?
- A local sports facility conducts a survey of parents in the community asking which sports it should offer in the summer. Is this a representative sample of the population that might use the facility? How would you find out?
- A local Indigenous company wants to expand their all-natural soap products and find out whether people like the new fragrances and product name ideas. Whom should they survey? Why?

D1.2

collect data, using appropriate sampling techniques as needed, to answer questions of interest about a population, and organize the data in relative-frequency tables

**questions of interest:**

- What actions can we take to help the environment?

**sampling techniques:**

**relative-frequency table:**

- the number of responses for each category is expressed as a proportion and a percentage of the total number of responses (50):

- The type and amount of data to be collected is based on the question of interest. Data can be either qualitative (e.g., colour, type of pet) or quantitative (e.g., number of pets, height).
- Depending on the question of interest, the data may need to be collected from a primary or a secondary source.
- Depending on the question of interest, a random sample of the population may need to be taken. Types of sampling methods include simple random sampling, stratified random sampling, and systematic random sampling.
- A relative-frequency table is an extension of a frequency table and shows the frequency of each category expressed as a proportion of the total frequency, represented using fractions, decimals, or percentages. The sum of the relative frequencies is 1, or 100%.

*Note*

- Data from every subject in the sample must be collected in the same manner in order for the data to be representative of the population.

As a class, agree on a question of interest. Then organize the class in small groups and assign each group a different sample size (e.g., 100% of the class, 50% of the class, 25% of the class, 20% of the class, or 5% of the class). Each group must decide on a sampling technique and then collect and organize their data.

As a follow-up, have students compare the different data sets and discuss what they notice about the sample sizes and the sampling techniques used. Have each group discuss what they found interesting or challenging about the sampling technique they used, and which techniques were the most appropriate to use. This will build students’ understanding of sampling techniques.

After students have collected relevant data on a question of interest, have them organize it in a relative-frequency table. Demonstrate how the relative frequency can be determined by calculating the fractional amount and percentage that each category represents.
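The calculation of relative frequencies can be sketched as follows. The category names and counts are hypothetical, chosen only so that they total 50 responses; the function name is also an assumption made for this illustration.

```python
from fractions import Fraction

def relative_frequency_table(counts):
    """Return {category: (count, fraction of total, percentage of total)}."""
    total = sum(counts.values())
    table = {}
    for category, count in counts.items():
        proportion = Fraction(count, total)   # fractional amount, in lowest terms
        percentage = count * 100 / total      # the same amount as a percentage
        table[category] = (count, proportion, percentage)
    return table

# Hypothetical survey responses (50 in total).
responses = {
    "recycle more": 20,
    "plant a garden": 15,
    "save water": 10,
    "pick up litter": 5,
}

table = relative_frequency_table(responses)
for category, (count, proportion, percentage) in table.items():
    print(f"{category}: {count} responses, {proportion} of the total, {percentage:.0f}%")
```

Adding up the fractions gives 1 and adding up the percentages gives 100%, which is a quick check that no response was missed.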

#### Data Visualization

D1.3

select from among a variety of graphs, including stacked-bar graphs, the type of graph best suited to represent various sets of data; display the data in the graphs with proper sources, titles, and labels, and appropriate scales; and justify their choice of graphs

**choice of graphs:**

- pictograph
- line plot
- bar graph
- multiple-bar graph
- stacked-bar graph

**bar graph with relative frequency:**

- Relative frequencies can be used to compare data sets that are of different sizes.
- Stacked-bar graphs can be created in more than one way to show different comparisons, including with horizontal and vertical bars.
- Stacked-bar graphs display the data values proportionally. Stacked-bar graphs can be used to display percent, or relative frequency. Each bar in the graph represents a whole, and each of the segments in a bar represents a different category. Different colours are used within each bar to easily distinguish between categories.
- The source, titles, labels, and scales provide important information about data in a graph or table:
  - The source indicates where the data was collected.
  - The title introduces the data contained in the graph.
  - Labels on the axes of a graph describe what is being measured (the variable). A key on a stacked-bar graph indicates what each portion of the bar represents.
  - Scales are indicated on the axes of bar graphs, showing frequencies, and in the key of pictographs.
  - The scale for relative frequencies is indicated using fractions, decimals, or percents.

*Note*

- The type of scale chosen depends on whether frequencies or relative frequencies will be displayed on the graphs.
- Depending on the scale that is chosen, it may be necessary to estimate the length of the bars or the portions of the bars on a stacked-bar graph.

Have students construct three different bar graphs using the same data set. For example, the Grade 5 students at School B decided to install a rain barrel that would provide water for the new butterfly garden. The data in the chart shows the amount of rain they collected during April.

Ask students to graph the data for Week 1 using one-to-one, two-to-one, and five-to-one correspondence. Ask them to compare the shape of the data among the three graphs and describe what they notice. Ask them to explain which of the three bar graphs best represents the data and why.

Finally, have students prepare a bar graph for Week 1 using relative frequencies. Ask them to describe how this graph is the same as and different from the other bar graphs they made. Creating bar graphs using relative frequencies will support students in understanding graphs that use percentages to describe data.

**determining relative frequencies:**

**graphing relative frequencies:**

Stacked-bar graphs in combination with multiple-bar graphs provide different perspectives for comparing data sets. Have students create a multiple-bar graph showing the rain collected in the rain barrel during the month of April.

Next, ask students to create a stacked-bar graph showing the amount of water collected in the rain barrel during April.

Support students in making connections between the two graphs by having them describe how the bars in the multiple-bar graph correspond to those in the stacked-bar graph.

Ask students what information they can gather from this graph that was not as obvious in the multiple-bar graph, and vice versa. Discuss which of the two graphs is a better representation for this data, asking students to explain their thinking.

Ask students to create a relative-frequency table and stacked-bar graph for a previously collected data set.

Have students create appropriate graphs in various contexts throughout the year, including cross-curricular applications.

D1.4

create an infographic about a data set, representing the data in appropriate ways, including in relative-frequency tables and stacked-bar graphs, and incorporating any other relevant information that helps to tell a story about the data

**infographic on the topic of “Grade 5 Students Take Environmental Action!”:**

- Infographics are used in real life to share data and information on a topic in a concise and appealing way.
- Infographics contain different representations, such as tables, plots, and graphs, with minimal text.
- Information to be included in an infographic needs to be carefully considered so that it is clear, concise, and connected.
- Infographics tell a story about the data with a specific audience in mind. When creating infographics, students need to create a narrative about the data for that audience.

*Note*

- Creating infographics has applications in other subject areas, such as communicating key findings and messages in STEM projects.

To deepen their understanding of infographics and their purpose, have students examine the features and messages of an infographic, such as “Grade 5 Students Take Environmental Action!”, which is found in the examples for D1.4. Ask questions such as:

- What audience do you think the infographic was intended for?
- What messages do you think the author was trying to share?
- What data visualizations has the author used? Why do you think they were chosen?

Have students collect infographics. Then, as a class, make a list of features they notice in the infographics. Discuss how these features can change depending on the audience and the story that the author is trying to tell about the data.

For a question of interest that they have collected data for, have students consider what they would like to share about the results and how they would like to share it. For example, students could use infographics to share information in the school newsletter. Ask them to identify their audience, what message they want to get across, what data visualization techniques they will use, and any other information that will help them to share their message. Have them share their ideas with a peer to check that their message is coming through.

#### Data Analysis

**determining the mean, median, and mode for a given data set:**

- donations given: $25.50, $32.50, $25.50, $45.00, $34.75, $28.25, $15.25, $25.00, $30.00, $27.25

**mean: The average donation is $28.90.**

- sum of the values divided by the number of values:
  - $15.25 + $25.00 + $25.50 + $25.50 + $27.25 + $28.25 + $30.00 + $32.50 + $34.75 + $45.00 = $289.00
  - $289 ÷ 10 = $28.90

**median: The median of the donations is $27.75.**

- Step 1. Data is ordered from least to greatest (the two middle values are shown in bold):
  - $15.25, $25.00, $25.50, $25.50, **$27.25, $28.25**, $30.00, $32.50, $34.75, $45.00
- Step 2. Since there are two values in the middle of the list, the median is the mean of these two values:
  - ($27.25 + $28.25) ÷ 2 = $27.75
  - Half of the donations are greater than $27.75.
  - Half of the donations are less than $27.75.

**mode: $25.50, since it is the value that appears the most in the list:**

- donations given: *$25.50*, $32.50, *$25.50*, $45.00, $34.75, $28.25, $15.25, $25.00, $30.00, $27.25

- The mean, median, and mode can be determined for quantitative data. Only the mode can be determined for qualitative data.
- A variable can have one mode, multiple modes, or no modes.
- The use of the mean, median, or mode to make an informed decision is relative to the context.

*Note*

- The mean, median, and mode are the three measures of central tendency.
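The donation example can be checked with Python's standard `statistics` module. This is a minimal sketch for verifying the arithmetic, not a prescribed classroom method; it assumes Python 3.8 or later, where `statistics.multimode` is available and returns every value tied for most frequent.

```python
import statistics

# Donations from the example, in dollars.
donations = [25.50, 32.50, 25.50, 45.00, 34.75, 28.25, 15.25, 25.00, 30.00, 27.25]

mean = statistics.mean(donations)        # sum of the values divided by the number of values
median = statistics.median(donations)    # mean of the two middle values for an even count
modes = statistics.multimode(donations)  # list of the most frequent value(s)

print(f"mean: ${mean:.2f}")      # mean: $28.90
print(f"median: ${median:.2f}")  # median: $27.75
print("mode(s):", ", ".join(f"${m:.2f}" for m in modes))  # mode(s): $25.50
```

Because `multimode` returns a list, the same code also handles data sets with several modes; when every value occurs equally often, it returns all of the values.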

When possible, have students determine the mean, median, and mode of the same data set. Provide students with data sets where the sum of the data values is a three-digit number that when divided will result in a whole number, a decimal tenth, or a decimal hundredth. For example, have students determine the mean, median, and mode for the data displayed in the stem-and-leaf plot below, showing the number of minutes that the weeding team logged during the first week of May.

Have students determine the mean, median, and mode for previously collected data, from a variety of sources, including cross-curricular applications, such as science experiments.

D1.6

analyse different sets of data presented in various ways, including in stacked-bar graphs and in misleading graphs, by asking and answering questions about the data, challenging preconceived notions, and drawing conclusions, then make convincing arguments and informed decisions

**data presented in various ways:**

**questions that require reading and interpreting data from a graph or table:**

- What type of action do most students want to take to help the environment?
- How many students in each grade requested fruit as a snack?

**questions that require finding data from a graph or table and using it in a calculation:**

- What percentage of students weeded the garden for more than 15 minutes?
- What was the average weekly rainfall in April?

**questions that require using data from a graph to make an inference or prediction:**

- Why do you think that the rainwater collected was greatest on Mondays?
- When do you think most of the weeding team worked in the garden? Explain your thinking.

**misleading graph (with values that do not start at zero and therefore appear exaggerated):**

- Different representations are used for different purposes to convey different types of information.
- Stacked-bar graphs present information in a way that allows the reader to compare multiple data sets proportionally.
- Sometimes graphs misrepresent data or show it inappropriately, which could influence the conclusions that we make. Therefore, it is important to always interpret presented data with a critical eye.
- Data presented in tables, plots, and graphs can be used to ask and answer questions, draw conclusions, and make convincing arguments and informed decisions.
- Sometimes presented data challenges current thinking and leads to new and different conclusions and decisions.
- Questions of interest are intended to be answered through the analysis of the representations. Sometimes the analysis raises more questions that require further collection, representation, and analysis of data.

*Note*

- There are three levels of graph comprehension that students should learn about and practise:
  - Level 1: information is read directly from the graph and no interpretation is required.
  - Level 2: information is read and used to compare (e.g., greatest, least) or perform operations (e.g., addition, subtraction).
  - Level 3: information is read and used to make inferences about the data using background knowledge of the topic.

- Working with misleading graphs supports students in analysing their own graphs for accuracy.

Provide students with a bar graph that shows only a portion of the bars. Ask them to recreate the bar graph with the horizontal axis starting at zero. Have them compare the two graphs and describe what made the original graph misleading or deceptive.
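A short calculation can show why a scale that does not start at zero exaggerates differences. The two values and the starting point of 45 below are hypothetical, and the function name is an assumption made for this sketch.

```python
def drawn_bar_lengths(values, axis_start):
    """Length of each bar as drawn when the scale begins at axis_start."""
    return [value - axis_start for value in values]

values = [50, 55]  # hypothetical data values

from_zero = drawn_bar_lengths(values, axis_start=0)    # [50, 55]
truncated = drawn_bar_lengths(values, axis_start=45)   # [5, 10]

# How many times longer the second bar looks than the first:
print(from_zero[1] / from_zero[0])   # 1.1
print(truncated[1] / truncated[0])   # 2.0
```

With the scale starting at zero, the second bar is only slightly longer than the first. With the scale starting at 45, it is drawn twice as long, even though the data values have not changed.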

Provide students with a frequency table, a stacked-bar graph, and a multiple-bar graph displaying the same information. Have them create three questions about the data and ask a classmate to answer them. Have them share questions and discuss possible answers. Creating opportunities for discussion about data can lead to additional questions.

Throughout the year, have students collect representations of data about real-life topics that are of interest to them. Model asking questions using the three types of questions outlined in the examples for D1.6, and have students pose and answer their own questions that require thinking critically about the data.