Data Analysis and Decision Making

Expert-defined terms from the Professional Certificate in Quality Assurance in Higher Education course at Stanmore School of Business. Free to read, free to share, paired with a globally recognised certification pathway.

Data Analysis and Decision Making Glossary

1. Data Analysis

Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

It involves the use of statistical and mathematical techniques to analyze and interpret data.

- Descriptive Statistics: Descriptive statistics are used to summarize and describe the main features of a dataset. Examples include mean, median, mode, range, and standard deviation.

- Inferential Statistics: Inferential statistics are used to make predictions or inferences about a population based on a sample of data.

- Data Visualization: Data visualization is the graphical representation of data to help users understand complex data patterns and trends.
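The descriptive statistics listed above can be computed directly with Python's standard library. A minimal sketch, using hypothetical exam scores:

```python
import statistics

scores = [62, 75, 75, 81, 90, 68, 75, 84]  # hypothetical exam scores

mean = statistics.mean(scores)          # arithmetic average: 76.25
median = statistics.median(scores)      # middle value: 75
mode = statistics.mode(scores)          # most frequent value: 75
spread = max(scores) - min(scores)      # range: 28
stdev = statistics.stdev(scores)        # sample standard deviation
```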

2. Decision Making

Decision making is the process of choosing between alternative courses of action.

It involves evaluating options and selecting the most suitable one to achieve a specific goal or outcome.

- Decision Support Systems: Decision support systems are computer-based tools that help decision-makers analyze data and information to make informed decisions.

- Risk Management: Risk management involves identifying, assessing, and prioritizing risks to minimize their impact on decision-making processes.

- Cost-Benefit Analysis: Cost-benefit analysis is a technique used to compare the costs of a decision with the benefits it will provide.

3. Regression Analysis

Regression analysis is a statistical technique used to examine the relationship between a dependent variable and one or more independent variables.

It helps in understanding how the value of the dependent variable changes when one or more independent variables are varied.

- Linear Regression: Linear regression is a type of regression analysis where the relationship between the dependent variable and independent variable(s) is modeled as a linear equation.

- Multiple Regression: Multiple regression is a form of regression analysis that examines the relationship between one dependent variable and two or more independent variables.

- Logistic Regression: Logistic regression is a regression analysis technique used when the dependent variable is categorical (e.g., yes/no, 0/1).
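For the linear case, the slope and intercept have closed-form least-squares solutions. A minimal sketch in Python, using a hypothetical hours-studied vs. exam-score dataset:

```python
# Fit y = a + b*x by ordinary least squares (closed-form solution).
x = [1, 2, 3, 4, 5]            # hypothetical: hours studied
y = [52, 55, 61, 64, 68]       # hypothetical: exam score

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# slope = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x        # intercept

def predict(xi):
    return a + b * xi          # fitted regression line
```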

4. Hypothesis Testing

Hypothesis testing is a statistical method used to make inferences about a population based on sample data.

It involves formulating a null hypothesis and an alternative hypothesis, collecting data, and performing statistical tests to determine whether the null hypothesis should be rejected.

- Type I Error: A Type I error occurs when the null hypothesis is incorrectly rejected when it is actually true. It is also known as a false positive.

- Type II Error: A Type II error occurs when a false null hypothesis is not rejected. It is also known as a false negative.

- Significance Level: The significance level is the probability of rejecting the null hypothesis when it is actually true. It is denoted by alpha (α) and is typically set at 0.05.
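A minimal sketch of the test decision, assuming a standard-normal test statistic; the z value is hypothetical, and the two-sided p-value is obtained from the normal tail via `math.erfc`:

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))

alpha = 0.05   # significance level
z = 2.1        # hypothetical test statistic from a z-test
p = two_sided_p(z)

reject_null = p < alpha   # True here: p is about 0.036, below 0.05
```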

5. Cluster Analysis

Cluster analysis is a data mining technique used to group a set of objects in such a way that objects in the same group are more similar to each other than to objects in other groups.

It helps in identifying patterns and relationships in data.

- Hierarchical Clustering: Hierarchical clustering is a clustering method that creates a tree of clusters, known as a dendrogram, to represent the relationships between data points.

- Density-Based Clustering: Density-based clustering is a method that groups together points that are closely packed in a high-density region.

6. Time Series Analysis

Time series analysis is a statistical technique used to analyze and interpret time series data, i.e., observations recorded over time.

It involves studying the patterns, trends, and cycles in the data to make forecasts and predictions.

- Autocorrelation: Autocorrelation is a measure of the correlation between values of a time series at different points in time.

- Moving Average: A moving average is a technique used to smooth out fluctuations in time series data by calculating the average of a subset of data points.

- Seasonal Decomposition: Seasonal decomposition is a method used to separate a time series into its trend, seasonal, and residual components.
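The moving-average idea above can be sketched in a few lines of Python; the demand series and window size are hypothetical:

```python
def moving_average(series, window):
    """Mean of each consecutive `window`-sized slice of the series."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

demand = [10, 12, 13, 12, 15, 16, 18]   # hypothetical monthly demand
smoothed = moving_average(demand, window=3)
```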

7. Data Mining

Data mining is the process of discovering patterns, relationships, and useful information in large datasets.

- Association Rules: Association rules are if-then statements that identify relationships between variables in a dataset.

- Classification: Classification is a data mining technique used to predict the class or category of a new observation based on training data.

- Clustering: Clustering is a data mining technique used to group similar objects together based on their characteristics.

8. Forecasting

Forecasting is the process of making predictions about future events based on historical data.

It involves analyzing past patterns and using them to estimate future outcomes.

- Time Series Forecasting: Time series forecasting is a specific type of forecasting that involves predicting future values of a time series based on historical data.

- Exponential Smoothing: Exponential smoothing is a technique used in time series forecasting to give more weight to recent observations.

- ARIMA Models: ARIMA (AutoRegressive Integrated Moving Average) models are a class of models used for time series analysis and forecasting.
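Single exponential smoothing, the simplest member of this family, updates each smoothed value as a weighted blend of the newest observation and the previous smoothed value. A sketch with a hypothetical sales series and smoothing factor:

```python
def exponential_smoothing(series, alpha):
    """s_t = alpha * x_t + (1 - alpha) * s_(t-1), seeded with x_0."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

sales = [100, 110, 104, 120, 115]          # hypothetical weekly sales
fitted = exponential_smoothing(sales, alpha=0.5)
```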

9. Data Quality

Data quality refers to the accuracy, completeness, consistency, and reliability of data.

It is essential for ensuring that data analysis and decision-making processes are based on high-quality, trustworthy data.

- Data Cleansing: Data cleansing is the process of detecting and correcting errors and inconsistencies in data to improve its quality.

- Data Governance: Data governance is the overall management of the availability, usability, integrity, and security of data within an organization.

- Data Profiling: Data profiling is the process of analyzing data to gain an understanding of its structure, content, and quality.
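A toy cleansing pass might normalize formatting, drop incomplete records, and remove duplicates. The records and field names below are hypothetical:

```python
raw = [   # hypothetical records: formatting noise, a duplicate, a gap
    {"name": "  Ada Lovelace ", "email": "ADA@EXAMPLE.COM"},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "Grace Hopper", "email": None},
]

seen = set()
clean = []
for rec in raw:
    if not rec["email"]:
        continue                          # drop records missing an email
    email = rec["email"].strip().lower()  # normalize case and whitespace
    if email in seen:
        continue                          # drop duplicates
    seen.add(email)
    clean.append({"name": rec["name"].strip(), "email": email})
```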

10. Data Visualization

Data visualization is the graphical representation of data to help users understand complex data patterns and trends.

It involves creating visualizations such as charts, graphs, and maps to communicate insights effectively.

- Bar Chart: A bar chart is a graphical representation of data where bars of varying lengths are used to show the values of different categories.

- Scatter Plot: A scatter plot is a graphical representation of data points on a two-dimensional plane to show the relationship between two variables.

- Heat Map: A heat map is a graphical representation of data where values are represented by colors to show patterns and trends.

11. Statistical Analysis

Statistical analysis is the process of collecting, exploring, analyzing, and interpreting data.

It involves applying statistical techniques to draw meaningful insights from data.

- Central Limit Theorem: The central limit theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases.

- Confidence Interval: A confidence interval is a range of values within which the true population parameter is likely to lie with a certain level of confidence.

- Hypothesis Testing: Hypothesis testing is a statistical method used to make inferences about a population based on sample data.
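A 95% confidence interval for a mean can be sketched with the standard library, using the normal critical value 1.96 (a t critical value would give a slightly wider interval for a sample this small). The measurements are hypothetical:

```python
import math
import statistics

sample = [4.1, 3.9, 4.4, 4.0, 4.2, 3.8, 4.3, 4.1]   # hypothetical data

mean = statistics.mean(sample)
sem = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error

low, high = mean - 1.96 * sem, mean + 1.96 * sem    # 95% interval bounds
```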

12. Machine Learning

Machine learning is a branch of artificial intelligence that focuses on the development of algorithms that learn from data.

It involves training models on data to make decisions without being explicitly programmed.

- Supervised Learning: Supervised learning is a machine learning technique where the model is trained on labeled data to make predictions.

- Unsupervised Learning: Unsupervised learning is a machine learning technique where the model is trained on unlabeled data to find patterns and relationships.

- Deep Learning: Deep learning is a subset of machine learning that uses neural networks with multiple layers to learn complex patterns in data.

13. Exploratory Data Analysis

Exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often using visual methods.

It helps in understanding the underlying structure, patterns, and relationships in data.

- Box Plot: A box plot is a graphical representation of the distribution of a dataset that includes the median, quartiles, and outliers.

- Histogram: A histogram is a graphical representation of the frequency distribution of a dataset where bars of varying heights represent the frequency of data points.

- Correlation Analysis: Correlation analysis is a statistical method used to measure the strength and direction of the relationship between two variables.
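The most common such measure is the Pearson correlation coefficient, which ranges from -1 to +1. It can be computed directly from its definition; the data below is hypothetical:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

hours = [1, 2, 3, 4, 5]          # hypothetical: hours studied
score = [52, 55, 61, 64, 68]     # hypothetical: exam score
r = pearson(hours, score)        # strongly positive, close to +1
```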

14. Data Preprocessing

Data preprocessing is the process of cleaning, transforming, and preparing raw data for analysis.

It involves handling missing values, removing outliers, and standardizing data to ensure its quality and suitability for analysis.

- Feature Scaling: Feature scaling is a technique used to standardize the range of independent variables in a dataset to ensure equal importance during analysis.

- Dimensionality Reduction: Dimensionality reduction is a technique used to reduce the number of input variables in a dataset while preserving as much information as possible.

- Data Imputation: Data imputation is the process of filling in missing values in a dataset using statistical methods or machine learning algorithms.
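Two common scaling schemes are min-max rescaling (to the [0, 1] range) and z-score standardization (mean 0, standard deviation 1). A sketch over a hypothetical income feature:

```python
import statistics

def min_max(values):
    """Rescale values linearly into the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    mu, sigma = statistics.mean(values), statistics.stdev(values)
    return [(v - mu) / sigma for v in values]

incomes = [20_000, 35_000, 50_000, 80_000]   # hypothetical feature values
scaled = min_max(incomes)                    # [0.0, 0.25, 0.5, 1.0]
standardized = z_score(incomes)
```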

15. Data Warehousing

Data warehousing is the process of collecting, storing, and managing large volumes of data from multiple sources.

It involves integrating data from multiple sources into a centralized repository for analysis.

- Extract, Transform, Load (ETL): ETL is a process used to extract data from source systems, transform it into a suitable format, and load it into a data warehouse.

- Data Mart: A data mart is a subset of a data warehouse that is designed for a specific department or business unit.

- Data Warehouse Architecture: Data warehouse architecture refers to the design and structure of a data warehouse, including data storage, processing, and access layers.

16. Big Data Analytics

Big data analytics is the process of examining large and complex datasets to uncover hidden patterns, correlations, and other insights.

It involves using advanced analytics techniques to extract value from massive volumes of data.

- Hadoop: Hadoop is an open-source framework for distributed storage and processing of big data across clusters of computers.

- MapReduce: MapReduce is a programming model used for processing and generating large datasets in parallel across distributed systems.

- Data Lake: A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed for analysis.
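The map-shuffle-reduce pattern can be illustrated in-process with the canonical word-count example; a real framework would distribute each phase across a cluster:

```python
from collections import defaultdict

docs = ["big data big insights", "data lake data mart"]   # hypothetical input

def map_phase(doc):
    """Map: emit a (word, 1) pair for every word in the document."""
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    """Shuffle: group emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}

pairs = [pair for doc in docs for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))   # {'big': 2, 'data': 3, ...}
```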

17. Text Mining

Text mining is a data mining technique used to extract valuable information, patterns, and insights from unstructured text data.

It involves analyzing text documents to discover trends, sentiment, and relationships.

- Natural Language Processing (NLP): Natural Language Processing is a branch of artificial intelligence that focuses on enabling computers to understand, interpret, and generate human language.

- Sentiment Analysis: Sentiment analysis is a text mining technique used to determine the sentiment expressed in a piece of text, such as positive, negative, or neutral.

- Topic Modeling: Topic modeling is a text mining technique used to identify topics or themes in a collection of text documents.

18. Data Governance

Data governance is the overall management of the availability, usability, integrity, and security of data within an organization.

It involves establishing policies, processes, and controls to ensure that data is managed effectively and in compliance with regulations.

- Data Stewardship: Data stewardship is the role responsible for managing and ensuring the quality and security of data within an organization.

- Data Quality Management: Data quality management is the process of defining, monitoring, and improving the quality of data to ensure its accuracy and reliability.

- Data Privacy: Data privacy refers to the protection of personal information and sensitive data from unauthorized access or disclosure.

19. Data Security

Data security is the practice of protecting data from unauthorized access, use, disclosure, or destruction.

It involves implementing security measures to ensure the confidentiality, integrity, and availability of data.

- Encryption: Encryption is the process of converting data into a coded format that can only be decoded with a key or password.

- Access Control: Access control is the process of restricting access to data based on user authentication, authorization, and permissions.

- Data Breach: A data breach is an incident where sensitive or confidential data is accessed, stolen, or exposed without authorization.

20. Data Ethics

Data ethics refers to the moral principles and guidelines governing the collection, use, and sharing of data.

It involves ensuring that data is handled responsibly, ethically, and in compliance with legal and regulatory requirements.

- Privacy by Design: Privacy by Design is a framework that promotes embedding privacy and data protection considerations into the design and operation of systems, products, and services.

- Fairness in Machine Learning: Fairness in machine learning involves ensuring that algorithms are unbiased and do not discriminate against individuals based on protected characteristics.

- Data Anonymization: Data anonymization is the process of removing or encrypting personally identifiable information from datasets to protect the privacy of individuals.
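One simple technique replaces direct identifiers with salted hashes. Strictly speaking this is pseudonymization rather than full anonymization, since the mapping is repeatable given the salt; the record and salt below are hypothetical:

```python
import hashlib

def pseudonymize(value, salt):
    """Replace an identifier with a salted SHA-256 digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()

salt = "per-dataset-secret"   # hypothetical secret, stored separately
record = {"email": "ada@example.com", "grade": "A"}

safe = {"email": pseudonymize(record["email"], salt), "grade": record["grade"]}
```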

21. Data Governance Framework

A data governance framework is a structured approach to managing and controlling data assets within an organization.

It includes policies, procedures, roles, and responsibilities to ensure that data is managed effectively and securely.

- Data Governance Council: A data governance council is a group of stakeholders responsible for setting data governance policies, guidelines, and priorities within an organization.

- Data Governance Maturity Model: A data governance maturity model is a framework used to assess the effectiveness and maturity of an organization's data governance practices.

- Data Governance Tools: Data governance tools are software applications used to support and automate data governance processes, such as data quality, metadata management, and compliance.

22. Data Visualization Tools

Data visualization tools are software applications used to create graphical representations of data.

They include a variety of charts, graphs, and dashboards for visualizing data.

- Power BI: Power BI is a business analytics tool by Microsoft that enables users to visualize and share insights from their data through interactive dashboards and reports.

- Data Studio: Data Studio is a free data visualization tool by Google that allows users to create custom reports and dashboards using data from various sources.

23. Data-driven Decision Making

Data-driven decision making is an approach to making decisions based on data analysis and evidence rather than intuition or personal judgment. It involves using data to inform and support decision-making processes.

- Business Intelligence: Business intelligence is the use of data analysis tools and techniques to transform data into actionable insights for making informed business decisions.

- Key Performance Indicators (KPIs): Key Performance Indicators are quantifiable metrics used to evaluate the success of an organization, project, or process.

- Data-driven Culture: A data-driven culture is an organizational mindset that values and prioritizes data-driven decision making across all levels of the organization.

24. Data Warehouse

A data warehouse is a centralized repository that stores large volumes of structured data from multiple sources.

It is designed to support decision-making processes by providing a single source of truth for data.

- Data Mart: A data mart is a subset of a data warehouse that is designed for a specific department or business unit within an organization.

- Data Warehouse Architecture: Data warehouse architecture refers to the design and structure of a data warehouse, including data storage, processing, and access layers.

- Data Warehouse Schema: A data warehouse schema is the logical structure that defines how data is organized and stored in a data warehouse.

25. Data Mining Techniques

Data mining techniques are methods and algorithms used to extract patterns, trends, and insights from large datasets.

They include a variety of statistical and machine learning techniques for analyzing and interpreting data.

- Association Rule Mining: Association rule mining is a technique used to discover relationships between variables in a dataset.

- Clustering: Clustering is a data mining technique used to group similar objects together based on their characteristics.

- Classification: Classification is a data mining technique used to predict the class or category of a new observation based on training data.

26. Data Integration

Data integration is the process of combining data from different sources and formats into a unified view.

It involves transforming and harmonizing data to ensure consistency and accuracy across the organization.

- Extract, Transform, Load (ETL): ETL is a process used to extract data from source systems, transform it into a suitable format, and load it into a target database.

- Data Migration: Data migration is the process of moving data from one system or platform to another while maintaining its integrity and consistency.

- Master Data Management: Master Data Management is a process that ensures the uniformity, accuracy, and consistency of an organization's critical data assets.

27. Data Architecture

Data architecture is the design and structure of data assets within an organization.

It involves defining data models, standards, and policies to ensure that data is managed effectively.

- Data Model: A data model is a visual representation of how data is organized and stored within a database or data warehouse.

- Data Dictionary: A data dictionary is a central repository that defines and describes the data elements, attributes, and relationships in a database.

- Data Governance Framework: A data governance framework is a structured approach to managing and controlling data assets within an organization.

28. Data Mining Software

Data mining software is a type of application that enables users to extract patterns and insights from large datasets.

It includes tools for data preparation, modeling, and visualization to support data mining activities.

- RapidMiner: RapidMiner is an open-source data science platform that offers a wide range of tools for data preparation, machine learning, and predictive analytics.

- KNIME: KNIME is an open-source data analytics platform that allows users to create visual workflows for data mining, analysis, and reporting.

29. Data Modeling

Data modeling is the process of creating a visual representation of data structures and their relationships.

It helps in defining and organizing data to support business requirements.

- Entity-Relationship Diagram (ERD): An entity-relationship diagram is a visual representation of the entities in a database and the relationships between them.
