Data Literacy
“Data Literacy” is the ability to identify, organize, and structure data sets so that data can be read, written, and communicated in the appropriate context.
Growing demand for critical thinking is accelerating the need for data literacy. We are continuously exposed to data and data analysis in our personal lives, and data literacy is now becoming a workplace requirement as well. Credible data is necessary to make informed decisions.
The International Association for Data Quality, Governance and Analytics (IADQGA) is committed to satisfying these data literacy needs. Its approach will be broad, including data-driven reasoning and thinking. Data literacy involves information communicated through text, maps, graphical and tabular displays, numbers, and symbols. The IADQGA will seek to satisfy the needs of students and working professionals in all fields by providing both the basic underlying knowledge and advanced concepts and techniques.
The required level of data competence varies by profession and job classification, but strong basic data knowledge is always a required skill.
Data literacy can be segmented into three (3) levels:
Basic data literacy: awareness and understanding of the data request, the sources of data used (survey or actual process data), data types (structured vs. unstructured), and techniques for validating the sources of data
Intermediate data literacy: understanding the importance of data dictionaries and how to use basic Excel features to clean and organize data
Advanced data literacy: understanding more sophisticated data terminology, how to clean and organize data using advanced Excel features, how to integrate continuous external data feeds, and the data needs of Artificial Intelligence (AI) and Machine Learning (ML)
The IADQGA’s unified focus (Data & Statistical Literacy) is to provide its members with:
Basic data awareness
The importance of data quality
A working knowledge and understanding of basic data concepts, classification and terminology
Basic data collection techniques
Basic descriptive statistics
Basic predictive statistics
Statistical result interpretation
Basic statistical result communications & visualization
Data Quality Traits
There are 5 key data quality traits that must be understood and taken into consideration when working with data. These quality traits are:
Accuracy
Completeness
Reliability
Relevance
Timeliness
The results of any statistical analysis can be significantly jeopardized if these quality traits are not thoroughly understood and taken into consideration, whether the data is analyzed directly or used in Artificial Intelligence (AI) or Machine Learning (ML) applications.
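As a minimal sketch of how two of these traits might be checked in practice, the Python below measures completeness (the share of records with a value in a field) and timeliness (the share of records collected within an allowed age). The record layout, field names, sample values, reference date, and 30-day threshold are all illustrative assumptions, not part of any IADQGA method.

```python
from datetime import date

# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"id": 1, "amount": 120.0, "collected": date(2024, 1, 10)},
    {"id": 2, "amount": None, "collected": date(2024, 1, 12)},
    {"id": 3, "amount": 87.5, "collected": date(2023, 6, 1)},
    {"id": 4, "amount": 42.0, "collected": date(2024, 1, 14)},
]


def completeness(rows, field):
    """Share of rows in which `field` is present and not None."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)


def timeliness(rows, field, as_of, max_age_days):
    """Share of rows whose date in `field` is at most `max_age_days` old."""
    if not rows:
        return 0.0
    fresh = sum(
        1
        for row in rows
        if row.get(field) is not None
        and 0 <= (as_of - row[field]).days <= max_age_days
    )
    return fresh / len(rows)


as_of = date(2024, 1, 15)  # assumed reference date for the check
amount_completeness = completeness(records, "amount")
collected_timeliness = timeliness(records, "collected", as_of, max_age_days=30)
print(f"completeness(amount) = {amount_completeness:.2f}")
print(f"timeliness(collected) = {collected_timeliness:.2f}")
```

Accuracy, reliability, and relevance generally cannot be scored this mechanically, since they depend on comparing the data against a trusted source or against the purpose of the analysis.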
Data Cleansing and Organizing
The challenge of cleaning and organizing data overwhelms many organizations and is a major obstacle in many digitization and data analytics initiatives. Given the wide availability of Excel and users' familiarity with it, Excel will be used to demonstrate and perform data cleaning and organization techniques on data sets up to Excel's size limits.
Data and statistical literacy are components of a science of method (the sciences of method study how we think; Rand, 1966). The sciences of method can be classified by their focus (words vs. numbers) and by their method of reasoning (deductive vs. inductive).
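To make the idea of cleaning concrete, here is a small Python sketch of three common steps: trimming stray whitespace, normalizing capitalization, and removing duplicate rows. These are loosely comparable to Excel's TRIM and PROPER functions and its Remove Duplicates command. The function name `clean_rows` and the sample rows are hypothetical, and real data sets usually need rules tailored to each column.

```python
def clean_rows(rows):
    """Trim and collapse whitespace, normalize case, and drop duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # " ".join(value.split()) strips the ends and collapses inner runs of spaces.
        normalized = tuple(" ".join(value.split()).title() for value in row)
        if normalized in seen:
            continue  # skip rows that are duplicates after normalization
        seen.add(normalized)
        cleaned.append(normalized)
    return cleaned


# Hypothetical raw rows (name, city); values are illustrative only.
raw = [
    ("  alice  smith ", "boston"),
    ("Alice Smith", "Boston "),
    ("BOB JONES", "new   york"),
]
cleaned_rows = clean_rows(raw)
for row in cleaned_rows:
    print(row)
```

Normalizing before comparing matters here: the first two rows only count as duplicates once spacing and capitalization have been made consistent.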
Method of Reasoning
Understanding the Data Request
A clear understanding of the intended use of the requested data is critical. Questions to ask include:
Will the data be used for historical analysis or predictive modeling?
What level of data is required?
Is there a data dictionary in use for the data?
Is the data categorical or numeric?
Is the data discrete or continuous?
Is the data measured on an interval or a ratio scale?
When was the data collected?
How often is the available data collected?
How accessible is the data?
How much data is required? (Machine Learning (ML) requires large amounts of data.)
Will visualization of the data be required? (Human vision is a very high-bandwidth channel to the brain and allows rapid interpretation of, and inference from, large amounts of data.)
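Some of these questions can be partly answered by inspecting the data itself. The sketch below, in Python, gives a rough first-pass label for a column: categorical, numeric and discrete, or numeric and continuous. The function name `describe_column`, the sample columns, and the rule that whole-number values indicate discrete data are assumptions made for illustration; a data dictionary, where one exists, is the better authority.

```python
def describe_column(values):
    """Return a rough label for a column based only on its values."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"  # nothing to inspect
    # bool is a subclass of int in Python, so exclude it from the numeric check.
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if not is_numeric:
        return "categorical"
    # Heuristic assumption: whole-number values are treated as discrete counts.
    if all(float(v).is_integer() for v in present):
        return "numeric (discrete)"
    return "numeric (continuous)"


# Hypothetical columns; names and values are illustrative only.
columns = {
    "region": ["North", "South", "North"],
    "units_sold": [3, 5, 2],
    "weight_kg": [1.25, 0.8, 2.0],
}
labels = {name: describe_column(values) for name, values in columns.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Whether a numeric column is on an interval or a ratio scale cannot be inferred from the values alone, because it depends on whether zero represents a true absence of the quantity.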
Critical Thinking
Critical thinking is required by every organization, whether for-profit or not-for-profit. An organization’s ability to have its members at all levels perform critical thinking is essential to its ongoing success.
Organizations that can think critically will differentiate themselves and command a superior competitive position.
Both “Data” and “Statistical” literacy are the building blocks of critical thinking.