
Common Terminology in Adversarial Machine Learning
Artificial Intelligence (AI) and Machine Learning (ML) have vast applications in cyberspace. With their rapid adoption and seemingly limitless possibilities, the industry needs authorities who can provide expertise and perspective to guide other professionals in their exploration of Large Language Models (LLMs). One of the best ways to start learning a new area is by studying the common terminology practitioners use. We created this glossary of terms to help anyone researching AI and ML understand discussions around Adversarial Machine Learning.
Artificial Intelligence (AI) versus Machine Learning (ML)
Before we dive in, let’s level set on the differences between AI and ML, or perhaps the lack thereof.
Artificial Intelligence
Artificial Intelligence is a broader field that focuses on creating machines that can perform tasks that typically require human intelligence. It aims to build systems that can reason, learn, perceive, and understand natural language, among other capabilities. AI encompasses various techniques, and machine learning is one of its subfields.
Machine Learning
Machine Learning is a subset of AI that deals with designing algorithms and models that enable computers to learn from data without explicit programming. Instead of being programmed with specific rules, ML models use patterns and examples to improve their performance on a given task. ML can be further divided into different categories, such as supervised learning, unsupervised learning, and reinforcement learning, each suited for different types of learning tasks.
While they are closely related areas, they do have nuanced differences. To put it concisely, AI is a broader field that encompasses various techniques and methods to create intelligent systems, while ML is a specific approach within AI that focuses on learning from data to improve task performance.
At this point in time, definitions within the realm of Adversarial Machine Learning (AML) lack standardization. We recognize the significance of clear and robust definitions in shaping the future of AML, which is why our team is invested in refining and solidifying these definitions to help establish industry standards. By leveraging NetSPI’s expertise and in-house knowledge, we strive to present definitions that are not only comprehensive but also accurate and relevant to the current state of AML.
Key Terminology in AI Cybersecurity
Term | Definition |
---|---|
Adversarial Attacks | Techniques employed to create adversarial examples and exploit the vulnerabilities of machine learning models. |
Adversarial Example Detection | Methods designed to distinguish adversarial examples from regular clean examples and prevent their misclassification. |
Adversarial Examples | Inputs that have been subtly modified to cause a machine learning model to misclassify or make incorrect predictions, potentially leading to harmful consequences. AML hinges on the idea that models can be deceived and manipulated by these carefully crafted modifications to input data. The implications range from evading spam filters and malware detection systems to fooling autonomous vehicles’ object recognition systems. |
Adversarial Learning/Training | A learning approach that involves training models to be robust against adversarial examples or actively generating adversarial examples to evaluate the model’s vulnerability. |
Adversarial Machine Learning (AML) | A field that focuses on studying the vulnerabilities of machine learning models to adversarial attacks and developing strategies to enhance their security and robustness. |
Adversarial Perturbations | Small, carefully crafted changes to the input data that are imperceptible to humans but can cause significant misclassification by the machine learning model. A minimal sketch of crafting such a perturbation appears after this glossary. |
Adversarial Robustness Evaluation | The process of assessing the robustness of a machine learning model against adversarial attacks, often involving stress testing the model with various adversarial examples. |
Adversarial Training | A defense technique involving the augmentation of the training set with adversarial examples to improve the model’s robustness (see the training-loop sketch after this glossary). |
Autoencoders | Neural network models trained to reconstruct the input data from a compressed representation, useful for unsupervised learning and dimensionality reduction tasks. |
Batch Normalization | A technique used to improve the training stability and speed of neural networks by normalizing the inputs of each layer. |
Bias-Variance Tradeoff | The tradeoff between the model’s ability to fit the training data well (low bias) and its ability to generalize to new data (low variance). |
Black-Box Attacks | Adversarial attacks where the attacker has limited knowledge about the target model, usually through input-output interactions. |
Certified Defenses | Defense methods that provide a “certificate” guaranteeing the robustness of a trained model against perturbations within a specified bound. |
Cross-Entropy Loss | A loss function commonly used in classification tasks that measures the dissimilarity between the predicted probabilities and the true class labels (a worked example appears after this glossary). |
Data Augmentation | A technique used to increase the diversity and size of the training dataset by generating new samples through transformations of existing data. |
Decision Boundaries | The dividing lines or surfaces that separate different classes or categories in a classification problem. They define the regions in the input space where the model assigns different class labels to the data points. Decision boundaries can be linear or nonlinear, depending on the complexity of the classification problem and the algorithm used. The goal of training a machine learning model is to learn the optimal decision boundaries that accurately separate the different classes in the data. |
Defense Mechanisms | Techniques and strategies employed to protect machine learning models against adversarial attacks. |
DefenseGAN | A defense technique that uses a Generative Adversarial Network (GAN) to project adversarially perturbed images onto clean images before classification. |
Deep Learning | A subfield of machine learning that utilizes artificial neural networks with multiple layers to learn hierarchical representations of data. |
Discriminative Models | Models that learn the boundary between different classes or categories in the data and make predictions based on this learned decision boundary. |
Dropout | A regularization technique where random units in a neural network are temporarily dropped out during training to prevent over-reliance on specific neurons. |
Ensemble Methods | Refer to machine learning techniques that combine the predictions of multiple individual models to make more accurate and robust predictions or decisions. Instead of relying on a single model, ensemble methods leverage the diversity and complementary strengths of multiple models to improve overall performance. |
Evasion Attacks | Adversarial attacks aimed at perturbing input data to cause misclassification or evasion of detection systems. |
Feature Engineering | The process of selecting, transforming, and creating new features from the available data to improve the performance of a machine learning model. |
Generative Models | Models that learn the underlying distribution of the training data and generate new samples that resemble the original data distribution. |
Gradient Descent | An optimization algorithm that iteratively updates the model’s parameters in the direction of steepest descent of the loss function to minimize the loss (see the sketch after this glossary). |
Gradient Masking/Obfuscation | Defense methods that intentionally hide or obfuscate the gradient information of the model to make adversarial attacks less successful. |
Gray-Box Attacks | Adversarial attacks where the attacker has partial knowledge about the target model, such as access to some internal information or limited query access. |
Hyperparameters | Parameters that are not learned from data during the training process but are set by the user before training begins. These parameters control the behavior and performance of the machine learning model. Unlike the internal parameters of the model, which are learned through optimization algorithms, hyperparameters are predefined and chosen by the user or the machine learning engineer. |
L1 and L2 Regularization | Techniques used to prevent overfitting by adding a penalty term to the model’s objective function, encouraging simplicity or smoothness (see the sketch after this glossary). |
Mean Squared Error (MSE) | A commonly used loss function that measures the average squared difference between the predicted and true values. |
Neural Networks | Computational models inspired by the structure and functioning of the human brain, consisting of interconnected nodes (neurons) organized in layers. |
Offensive Machine Learning (OML) | The practice of leveraging machine learning techniques to design and develop attacks against machine learning systems or to exploit vulnerabilities in these systems. Offensive machine learning aims to manipulate or deceive the target models, compromising their integrity, confidentiality, or availability. |
Overfitting | A phenomenon where a machine learning model becomes too specialized to the training data and fails to generalize well to new, unseen data. |
Poisoning Attacks | Adversarial attacks involving the injection of malicious data into the training set to manipulate the behavior of the model. |
Precision and Recall | Evaluation metrics used in binary classification tasks to measure the model’s ability to correctly identify positive samples (precision) and the model’s ability to find all positive samples (recall). |
Regularization Methods | Techniques that penalize large values of model parameters or gradients during training to prevent large changes in model output with small changes in input data. |
Reinforcement Learning | A machine learning paradigm where an agent interacts with an environment, taking actions and receiving rewards or penalties, in order to learn policies that maximize a cumulative reward signal. |
Robust Optimization | Defense techniques that modify the model’s learning process to minimize misclassification of adversarial examples and improve overall robustness. |
Security-Accuracy Trade-off | The trade-off between the model’s accuracy on clean data and its robustness against adversarial attacks. Enhancing one aspect often comes at the expense of the other. |
Semi-Supervised Learning | A learning paradigm that combines labeled and unlabeled data to improve the performance of a model by leveraging the unlabeled data to learn better representations or decision boundaries. |
Supervised Learning | A machine learning approach where the model learns from labeled training data, with inputs and corresponding desired outputs provided during training. |
Transfer Attacks | Adversarial attacks that exploit the transferability of adversarial examples to deceive target models with limited or no direct access. |
Transfer Learning | A technique that leverages knowledge learned from one task to improve performance on a different but related task. |
Transferability | The ability of adversarial examples generated for one model to deceive other similar models. |
Underfitting | A phenomenon where a machine learning model fails to capture the underlying patterns in the training data, resulting in poor performance on both the training and test data. |
Unsupervised Learning | A machine learning approach where the model learns patterns and structures from unlabeled data without explicit output labels. |
White-Box Attacks | Adversarial attacks where the attacker has complete knowledge of the target model, including its architecture, parameters, and internal gradients. |
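To make a few of these terms concrete, the short Python sketches below illustrate them under simplified assumptions; they are illustrative sketches, not production implementations. First, adversarial perturbations: the Fast Gradient Sign Method (FGSM) is one common way to craft them. The sketch assumes a PyTorch classifier `model`, an input batch `x` with values in [0, 1], true labels `y`, and a perturbation budget `epsilon`; these names are placeholders for illustration.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.03):
    """Craft an adversarial example with the Fast Gradient Sign Method (FGSM).

    The perturbation is the sign of the loss gradient with respect to the
    input, scaled by a small epsilon so the change stays nearly imperceptible.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)    # loss on the clean input
    loss.backward()                        # gradient of the loss w.r.t. x
    x_adv = x + epsilon * x.grad.sign()    # nudge each value in the worst-case direction
    return x_adv.clamp(0, 1).detach()      # keep the result in a valid input range
```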
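Adversarial training reuses the same idea defensively: each training batch is augmented with adversarial copies so the model learns to classify both. A minimal sketch, assuming the hypothetical `fgsm_example` helper above and a standard PyTorch `optimizer`:

```python
def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One training step on a batch augmented with adversarial examples."""
    x_adv = fgsm_example(model, x, y, epsilon)   # craft perturbed copies of the batch
    optimizer.zero_grad()                        # clear gradients left over from crafting
    loss = (F.cross_entropy(model(x), y)
            + F.cross_entropy(model(x_adv), y))  # fit clean and adversarial data together
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice, stronger attacks such as projected gradient descent (PGD) are often used to generate the training-time examples, and the clean and adversarial losses may be weighted differently.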
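Gradient descent itself fits in a few lines of plain Python. The quadratic function below is a made-up example; real models apply the same update rule to millions of parameters.

```python
def gradient_descent(grad_fn, w, lr=0.1, steps=100):
    """Repeatedly step opposite the gradient to minimize a loss."""
    for _ in range(steps):
        w = w - lr * grad_fn(w)
    return w

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), w=0.0)
print(round(w_min, 4))  # converges toward the minimum at w = 3
```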
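Cross-entropy loss has an equally small core: it is the negative log of the probability the model assigned to the true class. The probabilities below are invented purely to show how a confident mistake is penalized far more than a mild one.

```python
import numpy as np

def cross_entropy(probs, true_class):
    """Cross-entropy for a single prediction: -log(probability of the true class)."""
    return -np.log(probs[true_class])

print(cross_entropy(np.array([0.2, 0.7, 0.1]), true_class=1))    # ~0.36: mostly right
print(cross_entropy(np.array([0.9, 0.05, 0.05]), true_class=1))  # ~3.0: confidently wrong
```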
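Finally, L2 regularization simply adds a penalty on parameter magnitude to whatever task loss is being minimized. A sketch assuming a PyTorch model; in practice most frameworks expose this through an optimizer setting such as weight decay.

```python
def l2_regularized_loss(model, task_loss, reg_strength=1e-4):
    """Add an L2 penalty on the model's weights to the task loss.

    Penalizing large parameter values nudges the model toward simpler,
    smoother functions, which helps prevent overfitting.
    """
    l2_penalty = sum((p ** 2).sum() for p in model.parameters())
    return task_loss + reg_strength * l2_penalty
```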
Want to continue your education in Adversarial Machine Learning? Learn about NetSPI’s AI/ML Penetration Testing.