Data Analytics Glossary

Main concepts
Terms
Dimensionality Reduction
Learn about dimensionality reduction, a crucial data science concept that involves transforming high-dimensional data into a lower-dimensional space while preserving the most meaningful properties of the original data. Find out how it can help simplify data for processing, analysis, and visualization, and how it can be applied in various business scenarios.
Read
Distributions
Learn about statistical distributions in data analytics with this comprehensive guide from Graphext. Find out what they are, how they work, and why they're important for analyzing and interpreting data.
Read
Dremio
Dremio is a data platform designed for data analysts to interact with data in various ways, including SQL, Python, R, and more. It allows users to connect to various data sources, such as SQL databases, NoSQL databases, cloud storage platforms, and Hadoop systems, and enables them to transform and analyze data in real-time. Learn more about Dremio's self-service data exploration experience, fast data processing, and open-source foundation for optimizing your data analysis capabilities.
Read
Druid
Druid is an open-source data store designed for real-time analytics, optimized for handling complex queries on large data sets. Druid's scalability and ability to handle large data sets make it ideal for businesses that deal with high volumes of data. With Druid, businesses can store and analyze large amounts of data in real-time, allowing them to make quick and informed decisions based on their data. Learn more about Druid and its capabilities in real-time analytics.
Read
E-Commerce Analytics
Learn how e-commerce analytics can help you grow your online business. Discover advanced analytics use cases and tools to optimize your sales and marketing strategy. Explore the Graphext glossary for more insights.
Read
ETL
Learn about ETL, which stands for Extract, Transform, and Load, a crucial process in data engineering that ensures data is properly integrated and formatted for use by various applications and systems. Explore the key stages of ETL and how it can be applied to businesses to gain insights into customer behavior, optimize inventory levels, and improve overall business performance.
Read
Embedding
Learn about embedding, a common data science concept used in natural language processing, computer vision, and recommendation systems. Discover how it maps high-dimensional data into a low-dimensional space while preserving essential information. Explore its applications in businesses, including recommendation systems, natural language processing, and customer data clustering, to identify patterns and insights that inform business decisions.
Read
Exploratory Data Analysis
Exploratory Data Analysis (EDA) is a fundamental process in data analysis that involves the initial exploration of data to uncover hidden patterns, anomalies, and relationships. Learn how to apply EDA to gain insights into your data and make informed, data-driven decisions.
Read
Feature
Learn about the importance of features in data science and how they contribute to building relationships between data points and developing predictive models. Discover how businesses can use features to identify patterns in customer behavior, predict customer preferences, and make data-driven decisions that optimize operations and improve the bottom line. Explore key highlights and resources to learn more about feature engineering in machine learning and how it can be applied to business.
Read
Feature Selection
Learn about feature selection, a crucial concept in data science, and its relevance for using Graphext's predictive models. Created by Victoriano Izquierdo.
Read
Forecast Accuracy
This article provides an overview of forecast accuracy, including its definition, calculation methods, factors affecting accuracy, common problems and solutions, uses, and ways to improve accuracy.
Read
Geospatial Analysis
Learn about geospatial analysis and how it can be used to analyze complex spatial data in fields such as urban planning, environmental science, and logistics. Discover how businesses can gain insights into customer behavior, market trends, and resource allocation using geospatial analysis.
Read
Gephi
Learn about Gephi, an advanced software tool for creating and analyzing graphs. Discover how it can be used with Graphext and its relevant features. Created by Victoriano Izquierdo and last edited on June 13, 2023.
Read
Google Data Studio
Google Data Studio is a free web-based data visualization tool that allows you to easily create custom reports and dashboards using various data sources. Learn how businesses can use Data Studio to make data-driven decisions and inform business strategy. Explore the Google Data Studio Help Center, Gallery, and YouTube channel to discover more.
Read
Graph
A graph is a mathematical concept used to represent a set of nodes or vertices connected by edges or links. In data science, graphs can be used to visualize complex relationships between variables in a dataset, identify groups of related nodes or vertices, and understand patterns of interaction between entities. Learn more about graphs and their applications in data science here.
Read
HDBSCAN
Learn about HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise). Discover how this clustering technique can help businesses analyze high-dimensional data and detect patterns and trends in complex datasets. HDBSCAN is useful for customer segmentation, fraud detection, network analysis, and many other business applications.
Read
HEX
HEX is a software tool in the field of data analysis and visualization. Learn more about this expert-level tool on the Graphext glossary page. Created by Victoriano Izquierdo.
Read
Hierarchical Clustering
Learn about hierarchical clustering, a data science concept used for clustering and graphing data. This article provides an expert-level overview and is relevant for using Graphext. Created by Victoriano Izquierdo.
Read
Histograms
Learn about histograms in data analytics. Understand the basics of how histograms work and how they can be used to analyze data. Explore more at Graphext.
Read
Hypothesis Testing
Learn about hypothesis testing in data analytics. Understand its basic concepts and how it is used in statistical analysis. Read more on our website.
Read
JSON
Learn about JSON, a lightweight data-interchange format used for exchanging data between systems and applications. Discover how JSON can help businesses store and analyze data in a structured and organized way, improving data quality and reducing errors. Read more on Graphext's glossary.
Read
Jupyter Notebooks
Jupyter Notebooks is an open-source web application that allows data scientists, analysts, and researchers to write and execute code in a web-based environment. With support for various programming languages, it is a powerful tool for businesses that need to perform data analysis, machine learning, and exploratory computing. Learn more about Jupyter Notebooks and how it can improve your data analysis capabilities.
Read
K-Means
K-Means is a popular clustering algorithm used in data science to identify patterns in data. This advanced data science concept is easy to implement and can be used for a wide range of applications, including image segmentation, customer segmentation, and fraud detection. Learn more about K-Means and its business applications in this article.
Read
K-Nearest Neighbors
K-Nearest Neighbors (KNN) is a non-parametric algorithm used for both regression and classification tasks in supervised learning. Learn about its key highlights and applications in business.
Read