D1. Data Literacy
Specific Expectations
Data Collection and Organization
D1.1
identify situations involving one-variable data and situations involving two-variable data, and explain when each type of data is needed
- situations involving one variable:
- favourite movie
- number of cousins
- amount of rainfall
- length of arm span
- situations involving two variables:
- looking at change in amount of rainfall over the years
- determining if there is a relationship between a person’s arm span and their height
- determining if there is a relationship between time and distance travelled
- A variable is any attribute, number, or quantity that can be measured or counted.
- One-variable data refers to one data set from a sample or a population that can be either qualitative or quantitative. Situations involve representing and analysing the data based on that one variable to answer a question like “What is the average height of all the students in the class?”
- Two-variable data refers to two data sets from the same sample or population that can be either qualitative or quantitative. Situations involve representing and analysing the data based on two variables to answer questions like “Is there a relationship between a person’s height and the length of their arm span?”
Provide students with a bar graph, a histogram, a broken-line graph, a circle graph, and a scatter plot. Have them identify the single variables on the bar graph, histogram, and circle graph, and identify the two variables on the broken-line graph and scatter plot.
D1.2
collect continuous data to answer questions of interest involving two variables, and organize the data sets as appropriate in a table of values
- questions of interest involving two variables:
- Is there a relationship between the width (cm) and height (cm) of a TV screen?
- Is there a relationship between the diameter of a circle and its circumference?
- table of values:
- The type and amount of data to be collected is based on the questions of interest. Questions of interest involving two variables require two sets of data to be collected from the same sample or population.
- Depending on the question of interest, the continuous data may need to be collected from a primary or a secondary source.
- Depending on the question of interest, a random sample of the population may need to be taken. Types of sampling methods include simple random sampling, stratified random sampling, and systematic random sampling.
- A table of values for a scatter plot is a list of corresponding values of two variables for each subject in a sample or population.
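The three sampling methods named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class list, the grade groups, the sample sizes, and the function names are all hypothetical and do not come from the curriculum.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n subjects so that every subject is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick every k-th subject after a random starting point."""
    rng = random.Random(seed)
    k = len(population) // n          # sampling interval (assumes n <= len(population))
    start = rng.randrange(k)          # random start within the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, seed=None):
    """Take the same fraction from each group (stratum) by simple random sampling."""
    rng = random.Random(seed)
    sample = []
    for group in strata.values():
        size = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, size))
    return sample

# Hypothetical population: 60 students, split into two grade groups.
students = [f"student_{i}" for i in range(1, 61)]
by_grade = {"grade_7": students[:30], "grade_8": students[30:]}

print(simple_random_sample(students, 6, seed=1))
print(systematic_sample(students, 6, seed=1))
print(stratified_sample(by_grade, 0.1, seed=1))
```

Passing a seed makes the selection repeatable, which can help when students compare results.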
Note
- Plotting two variables on a scatter plot does not guarantee that there is a relationship between them. One variable can be described as depending on the other only if a relationship exists (i.e., for a relation, the x-value in each ordered pair (x, y) is the independent variable and the y-value is the dependent variable).
- Many science experiments involve the relationship between two variables. The independent variable is what the researcher gets to change, and the dependent variable is what the researcher gets to observe or measure during the experiment.
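As a rough illustration of a table of values, the sketch below stores one ordered pair per subject for the diameter and circumference question. The measurements are hypothetical (rounded to one decimal place), and the variable names are assumptions made for this example.

```python
# Hypothetical measurements of circular objects, in centimetres.
# Each row is one subject: (diameter, circumference).
table_of_values = [
    (3.0, 9.4),
    (5.0, 15.7),
    (8.0, 25.1),
    (12.0, 37.7),
]

# Print the table with a header row.
print(f"{'Diameter (cm)':>15} {'Circumference (cm)':>20}")
for diameter, circumference in table_of_values:
    print(f"{diameter:>15.1f} {circumference:>20.1f}")

# Ordered pairs (x, y): x is the independent variable, y the dependent one.
x_values = [d for d, _ in table_of_values]
y_values = [c for _, c in table_of_values]
```

Keeping both values of a subject in the same row is what allows the pairs to be plotted later as points on a scatter plot.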
Have students pick a question of interest from a list containing a variety of questions that you share, such as:
- Is there a relationship between the width (cm) and height (cm) of a computer screen?
- Is there a relationship between the height and width of flags? Students could investigate this question using a variety of flags from around the world and discover that the most common ratio is 2:3 (e.g., France and Japan), followed by 1:2 (e.g., Canada and Jamaica).
Providing students with examples of possible questions of interest can support them in coming up with their own.
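For the flag question, one possible way to compare measurements is to reduce each height and width to a ratio in lowest terms and tally the results. The sketch below assumes hypothetical measurements of printed flags; only the ratios 2:3 and 1:2 and the four country examples come from the text above.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical measurements (height cm, width cm) of printed flags.
flags = {
    "France": (60, 90),
    "Japan": (40, 60),
    "Canada": (50, 100),
    "Jamaica": (30, 60),
}

def height_to_width(height, width):
    """Reduce height:width to lowest terms, e.g. 60:90 becomes 2:3."""
    ratio = Fraction(height, width)
    return f"{ratio.numerator}:{ratio.denominator}"

# Ratio for each flag, then a count of how often each ratio occurs.
ratios = {country: height_to_width(h, w) for country, (h, w) in flags.items()}
tally = Counter(ratios.values())

for country, ratio in ratios.items():
    print(country, ratio)
print(tally.most_common())
```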
Data Visualization
D1.3
select from among a variety of graphs, including scatter plots, the type of graph best suited to represent various sets of data; display the data in the graphs with proper sources, titles, and labels, and appropriate scales; and justify their choice of graphs
- types of graphs involving one variable:
- types of graphs involving two variables:
- Scatter plots are used to display data points for two continuous variables. The horizontal axis identifies the possible values for one variable and the vertical axis identifies the possible values for the other variable.
- Broken-line graphs are used to show change over time. The value on the horizontal axis is usually time. One or none of the variables are continuous.
- Circle graphs are used to show how categories represent a part of the whole data set for one variable.
- Histograms display data as intervals of numeric data and their frequencies for one variable that is continuous.
- Pictographs, line plots, bar graphs, multiple-bar graphs, and stacked-bar graphs may be used to display qualitative data and discrete data, and their corresponding frequencies for one variable.
Note
- Data that is represented in a table of values displays two pieces of information for each subject in the sample or population. These two pieces of information can be graphed together using a scatter plot. Also, each piece of information can be treated separately and represented using another type of graph, such as a histogram or a circle graph.
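The idea in this note, that the same table of values can feed a scatter plot or be split into single variables, can be sketched as follows. The heights and arm spans are hypothetical, the 10 cm interval width is an assumption, and the text "bars" only stand in for a real histogram.

```python
from collections import Counter

# Hypothetical table of values: (height cm, arm span cm) for eight students.
table_of_values = [
    (148, 146), (152, 153), (155, 154), (158, 160),
    (161, 159), (163, 165), (167, 166), (172, 174),
]

# Together, the pairs are the points of a scatter plot.
scatter_points = table_of_values

# Separately, one column can be grouped into intervals for a histogram.
heights = [height for height, _ in table_of_values]
interval_width = 10   # assumed interval size in cm

def interval_label(value, width):
    """Return the interval that contains value, e.g. 152 -> '150 to <160'."""
    low = (value // width) * width
    return f"{low} to <{low + width}"

# Count how many heights fall in each interval and print a text histogram.
frequencies = Counter(interval_label(h, interval_width) for h in heights)
for label in sorted(frequencies):
    print(label, "#" * frequencies[label])
```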
Provide students with a variety of data sets related to different questions of interest. Have them create a graph for each data set and provide a rationale for their choice of graph. This type of activity will consolidate students’ learning about which graphs are appropriate for different types of data and will provide opportunities for them to make sense of graphs they may encounter in everyday life.
D1.4
create an infographic about a data set, representing the data in appropriate ways, including in tables and scatter plots, and incorporating any other relevant information that helps to tell a story about the data
- infographic on the topic of “Let’s Get Moving”:
- Infographics are used in real life to share data and information on a topic in a concise, clear, and appealing way.
- Infographics contain different representations, such as tables, plots, and graphs, with limited text such as quotes.
- Information to be included in an infographic needs to be carefully considered and presented so that it is clear and concise. Infographics tell a story about the data with a specific audience in mind. When creating infographics, students need to create a narrative about the data for that audience.
Note
- Creating infographics has applications in other subject areas, such as communicating key findings and messages in STEM projects.
To deepen their understanding of what an infographic is and its purpose, have students examine the features and messages of an infographic, such as the “Let’s Get Moving!” infographic found in the examples for D1.4. Ask questions such as:
- What audience do you think the infographic was intended for?
- What messages do you think the author was trying to share?
- What data visualizations has the author used? Why do you think they were chosen?
Have students create an infographic for previously collected data to share information. For example, students might want to share information with the parent council on a relevant topic. Ask students to identify their audience, what message(s) they want to get across, what data visualization techniques they will use, and any other information that will help them to share their message. Have students share their ideas with a peer to check that their message is coming through before they share with the intended audience.
Data Analysis
D1.5
use mathematical language, including the terms “strong”, “weak”, “none”, “positive”, and “negative”, to describe the relationship between two variables for various data sets with and without outliers
- types of relationships:
- When data points fall close to a line or a curve, this indicates that there is a strong relationship between the variables.
- When data points are in a cluster, this indicates that there is no relationship.
- The scatter plot of a weak relationship between two variables shows points that are more spread out than those showing a strong relationship.
- The scatter plot of a positive relationship shows points going upwards to the right (as one variable increases, so does the other). The scatter plot of a negative relationship shows points going downwards to the right (as one variable increases, the other decreases).
- If a data set has outliers, something may have gone wrong in the data collection (or measurement), and this requires further investigation. Alternatively, an outlier may represent a valid but unexpected part of the population that needs further clarification. If the investigation uncovers an error, the researcher should fix it. If the data point turns out to come from an individual that is not part of the population, it should be removed. If neither is the case, re-sampling may be needed.
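The terms "strong", "weak", "none", "positive", and "negative" are normally judged by eye from the scatter plot. As a supplementary sketch, the Pearson correlation coefficient r gives one numerical stand-in for that judgment. It is not part of this expectation, it only measures how close the points are to a straight line (not a curve), and the cut-off values 0.7 and 0.3 are assumptions chosen for illustration. The data are hypothetical.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r, between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r, strong=0.7, none=0.3):
    """Turn r into the vocabulary of D1.5. The cut-offs are assumptions."""
    if abs(r) < none:
        return "none"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"

heights = [148, 152, 155, 158, 161, 163, 167, 172]      # hypothetical, cm
arm_spans = [146, 153, 154, 160, 159, 165, 166, 174]    # hypothetical, cm
print(describe(correlation(heights, arm_spans)))
```

Because a single outlier can change r noticeably, computing it with and without a suspect point is one way to see how much that point matters.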
Note
- A line of best fit or a curve of best fit can be drawn through the majority of the points and used to make predictions where there is a strong relationship between the two variables.
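Students typically draw a line of best fit by eye. For comparison, the sketch below computes one with the least-squares method, which is a swapped-in technique and only one of several ways to fit a line, and then uses it to make a prediction. The measurements and the function name are hypothetical.

```python
def line_of_best_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

diameters = [3.0, 5.0, 8.0, 12.0]          # hypothetical, cm
circumferences = [9.4, 15.7, 25.1, 37.7]   # hypothetical, cm

slope, intercept = line_of_best_fit(diameters, circumferences)
predicted = slope * 10.0 + intercept        # prediction for a 10 cm diameter
print(f"slope = {slope:.2f}, predicted circumference = {predicted:.1f} cm")
```

With these made-up measurements the slope comes out close to 3.14, which matches the known relationship between diameter and circumference.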
Have students create a scatter plot of data they have collected to answer a question of interest that involves two variables. Then have them describe the strength of the relationship between the two variables. For those that have a strong relationship, have students make predictions about other possible data points.
D1.6
analyse different sets of data presented in various ways, including in scatter plots and in misleading graphs, by asking and answering questions about the data, challenging preconceived notions, and drawing conclusions, then make convincing arguments and informed decisions
- data presented in a variety of ways:
- question that requires reading and interpreting data from a graph or table:
- Use the scatter plot to determine the optimal distance to stand from a basketball net in order to make the most baskets.
- What is the circumference of a circle that has a diameter of 3 cm?
- question that requires finding data from a graph or table and using it in a calculation:
- Approximately how much bigger is Toronto than Ottawa?
- If the diameter of a circle increases by 2 cm, how will the circumference change?
- question that requires using data to make an inference or prediction:
- If a person doubled their height, would you expect that their arm span would also double? Explain your thinking.
- Do you think the relationship between the diameter and the circumference of a circle holds true for all circles? Why or why not?
- Scatter plots are used to determine whether a relationship exists between two numerical variables. Analysis of the scatter plot requires identifying how close the points are to forming a line or curve in order to conclude that there is a relationship.
- The range and the measures of central tendency may be used to analyse data involving one variable.
- Sometimes graphs misrepresent data or show it inappropriately, which could influence the conclusions that we make about it. Therefore, it is important to always interpret presented data with a critical eye.
- Data presented in tables, plots, and graphs can be used to ask and answer questions, draw conclusions, and make convincing arguments and informed decisions.
- Sometimes presented data challenges current thinking and leads to new and different conclusions and decisions.
- Questions of interest are intended to be answered through the analysis of the representations. Sometimes the analysis raises more questions that require further collection, representation, and analysis of data.
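For one-variable data, the range and the measures of central tendency mentioned above can be computed with Python's standard `statistics` module. The data set below (number of cousins for ten students) is hypothetical.

```python
from statistics import mean, median, multimode

# Hypothetical one-variable data: number of cousins for ten students.
cousins = [2, 5, 3, 8, 5, 0, 4, 5, 12, 6]

# Range: the difference between the greatest and least values.
data_range = max(cousins) - min(cousins)

print("range: ", data_range)
print("mean:  ", mean(cousins))
print("median:", median(cousins))
print("mode:  ", multimode(cousins))   # a list, since data can have several modes
```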
Note
- There are three levels of graph comprehension that students should learn about and practise:
- Level 1: information is read directly from the graph and no interpretation is required.
- Level 2: information is read and used to compare (e.g., greatest, least) or perform operations (e.g., addition, subtraction).
- Level 3: information is read and used to make inferences about the data using background knowledge of the topic.
Show students the “Cost of Services” graph below, and ask them what they notice and wonder. Support students in recognizing that the graph can be misleading because it presents one-dimensional information using three-dimensional cubes, making the difference between the data points look deceptively large. Students can use their knowledge of volume to determine that the last cube is 64 times as large as the first one. Yet the last money amount is only about four times as large as the first one.
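The arithmetic behind this misleading effect can be checked directly. The sketch assumes that each cube's side length is drawn in proportion to its money amount and uses placeholder amounts of 1 and 4, since the text only says that the last amount is about four times the first.

```python
first_amount = 1.0                 # hypothetical first money amount
last_amount = 4.0                  # about four times the first, per the text

linear_scale = last_amount / first_amount      # ratio of the data values
volume_scale = linear_scale ** 3               # ratio of the cube volumes

print(f"data ratio: {linear_scale:.0f}x, visual (volume) ratio: {volume_scale:.0f}x")
```

Scaling all three dimensions by 4 multiplies the volume by 4 × 4 × 4 = 64, which is why the last cube looks so much larger than the data justify.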
Ask students to look at the data they collected for their question of interest. Ask them what they notice about the data. For example, are there any surprises in the data, and, if so, what might explain them? Then ask students whether their data helps them answer their question or whether it raises more questions. Ask students whether they can begin to make any conclusions from their data and how they might use the data to make convincing arguments and informed decisions. If students do not have enough data, have them identify what further information they need to answer their question of interest.