Countering The Underrated Threat Of Data Poisoning Facing Your Organization – Forbes
The use of machine learning has skyrocketed over the past few years. The technology has made high-performance computing accessible to almost all businesses. Businesses now use machine learning in cybersecurity, social networks, e-commerce websites, search engines, video streaming platforms and more. As organizations and users increasingly rely on machine learning-based applications, security experts have begun warning about adversaries abusing the technology.
Attackers can use data poisoning to severely affect machine learning systems, which are extremely vulnerable to data manipulation. Cybersecurity experts refer to such malicious manipulation of machine learning systems as adversarial machine learning. Adversarial machine learning can be a massive threat to an organization's business operations. Affected machine learning-based applications could produce inaccurate results, affecting business processes drastically. Business leaders need to be mindful of data poisoning so they can create proactive strategies to prevent and mitigate such attacks.
Before creating effective strategies to protect machine learning systems, it is essential to understand what data poisoning is and how it can affect businesses. Data poisoning attacks contaminate a machine learning model's training data. Such attacks severely impact the model's ability to produce accurate predictions. To achieve this, attackers insert custom-made adversarial data into the data sets used to train a machine learning model, and the manipulated data is almost undetectable. The length of a data poisoning attack varies based on a model's training cycle. In some cases, a successful data poisoning attack may take weeks.
Data poisoning attacks can be performed in a black box scenario as well as a white box scenario. In a black box scenario, an attacker has no inside access and instead exploits classifiers that depend on user feedback to learn, submitting misleading feedback to them. In a white box scenario, an attacker illegally gains access to the model and its private training data, possibly at some point in the supply chain if the data is gathered from many sources.
Data poisoning attacks can allow attackers to gain access to confidential information in the training data using corrupted data samples. Attackers can also disguise inputs to trick a machine learning model into misclassifying them. In addition, data poisoning attacks enable adversaries to reverse-engineer a machine learning model, so that they can replicate and analyze it locally to prepare for more advanced attacks.
Attackers are already using data poisoning to target big players in the tech industry that use machine learning in cybersecurity. A few years ago, Google revealed that Gmail's spam filter was compromised at least four times, with several spam emails not being marked as spam. Attackers sent millions of emails to throw off the classifier and alter how it defines a spam email. This technique allowed attackers to send several undetected malicious emails containing malware or other cybersecurity threats.
Another example of data poisoning is Microsoft's Twitter chat bot, Tay. Tay was programmed to learn from and engage in casual conversation on Twitter. However, malicious users fed offensive tweets into Tay's algorithm, turning the innocent chat bot offensive. As a result, Microsoft had to shut down Tay just 16 hours after launch.
Preventing and mitigating data poisoning can be extremely tricky. Contaminated data is almost impossible to detect, and machine learning models are retrained with new data sets at intervals that depend on their use cases. Since data poisoning is a gradual process that happens over a number of training cycles, it is difficult to identify when the accuracy of a machine learning model began to decline.
Mitigating the damage done by data poisoning requires a time-consuming process that includes a historical analysis of all inputs for various classifiers to recognize all bad data samples and eliminate them. After this process, an organization would need to begin retraining the machine learning model from a version before the data poisoning attack. However, this entire procedure can be incredibly complicated and expensive when dealing with a large amount of data as well as a large number of data poisoning attacks. As a result, the affected machine learning model may never get fixed.
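One way to narrow down which model version to retrain from is to score every retrained version against the same trusted, fixed evaluation set and look for the cycle where accuracy first slipped. The following is a minimal sketch under stated assumptions: the function name `last_clean_cycle`, the `tolerance` value and the accuracy history are all hypothetical, and a slow attack that stays within the tolerance would not be caught this way.

```python
def last_clean_cycle(accuracy_by_cycle, tolerance=0.02):
    """Return the index of the last training cycle before accuracy fell
    more than `tolerance` below the best accuracy seen so far.

    Assumes a non-empty list of accuracies measured on one fixed,
    trusted evaluation set (illustrative assumption, not from the article).
    """
    best = accuracy_by_cycle[0]
    for cycle, acc in enumerate(accuracy_by_cycle):
        if acc < best - tolerance:
            # The previous cycle is the last one considered clean.
            return cycle - 1
        best = max(best, acc)
    # No drop detected: the latest cycle is still considered clean.
    return len(accuracy_by_cycle) - 1


# Hypothetical accuracy history, one entry per training cycle.
history = [0.95, 0.96, 0.955, 0.93, 0.90, 0.86]
rollback_to = last_clean_cycle(history)
print(f"Retrain from the model saved after cycle {rollback_to}")
```

The inputs that arrived after the returned cycle are then the first candidates for the historical analysis described above.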
Considering how time-consuming and complicated it is to detect and mitigate data poisoning, businesses need to develop a proactive approach to protecting machine learning models. Business leaders have to account for the vulnerabilities of machine learning in their organization's cybersecurity strategy, and they can consult cybersecurity experts to design strategies that bring machine learning systems under the same security measures as the rest of the business.
Countering the Underrated Threat of Data Poisoning Facing Your Organization
Organizations can consider the following techniques to protect machine learning models from data poisoning:
Machine learning engineers and developers have to focus on steps to block attempts at attacking the model and detect polluted data inputs before the next training cycle begins. For this, developers can perform regression testing, input validity checking, manual moderation, anomaly detection and rate limiting. This approach is simpler and more effective compared to fixing compromised models.
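As an illustration of anomaly detection before a training cycle, the sketch below flags incoming numeric samples that sit far from a trusted baseline, using a median-based (modified z-score) test. It assumes a single numeric feature; the function name `flag_outliers`, the threshold of 3.5 and the sample values are illustrative choices, not taken from the article, and carefully crafted poison that stays close to normal data would pass this check.

```python
from statistics import median


def flag_outliers(baseline, candidates, threshold=3.5):
    """Return the candidates whose modified z-score, measured against
    the trusted baseline, exceeds `threshold`."""
    center = median(baseline)
    # Median absolute deviation: a spread estimate that a few extreme
    # values cannot easily distort.
    mad = median(abs(x - center) for x in baseline)
    if mad == 0:
        # Degenerate baseline: anything different from the center is suspect.
        return [x for x in candidates if x != center]
    return [x for x in candidates
            if 0.6745 * abs(x - center) / mad > threshold]


# Hypothetical feature values: trusted history vs. newly submitted inputs.
baseline = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1]
incoming = [10.0, 10.4, 55.0, 9.7, -30.0]
suspicious = flag_outliers(baseline, incoming)
print(suspicious)
```

Flagged samples can be held back for manual moderation rather than silently dropped, so that legitimate but unusual inputs are not lost.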
Developers can restrict how many inputs can be provided by each unique user for the training data and they can also define the value of each input. A small group of users should not account for the majority of machine learning model training data. Along with these, developers can compare newly trained classifiers to the older ones by rolling them out to a small set of users only.
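The per-user restriction can be sketched as a simple cap applied before samples enter the training set. The names `cap_per_user` and `largest_share`, the tuple layout `(user_id, payload)` and the limit of two samples are assumptions made for illustration; a real system would also need to handle attackers who spread inputs across many accounts.

```python
from collections import Counter


def cap_per_user(samples, max_per_user):
    """Keep at most `max_per_user` samples from each user, in arrival order."""
    seen = Counter()
    kept = []
    for user_id, payload in samples:
        if seen[user_id] < max_per_user:
            kept.append((user_id, payload))
            seen[user_id] += 1
    return kept


def largest_share(samples):
    """Fraction of the samples contributed by the single most active user."""
    if not samples:
        return 0.0
    counts = Counter(user_id for user_id, _ in samples)
    return max(counts.values()) / len(samples)


# Hypothetical feedback submissions: user "u1" tries to dominate the data.
submissions = [("u1", "a"), ("u2", "b"), ("u1", "c"),
               ("u1", "d"), ("u3", "e"), ("u1", "f")]
kept = cap_per_user(submissions, max_per_user=2)
print(largest_share(submissions), largest_share(kept))
```

Monitoring `largest_share` over time gives a quick signal when a small group of users starts to account for a disproportionate part of the training data.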
Attackers need access to a lot of confidential information to execute a successful data poisoning attack. Therefore, organizations should be careful about sharing sensitive data and have strong access control measures in place for the machine learning model as well as its data. To do this effectively, business leaders need to make the protection of machine learning models part of the cybersecurity strategy used across the organization. The protection of machine learning models and data is tied to how an organization handles cybersecurity in general. Businesses can also restrict user permissions, enable multi-factor logins, and use data and file versioning to keep data sets safer.
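Data versioning can be backed by a content fingerprint so that any change to an approved training set is noticed before retraining. The sketch below is one possible approach, assuming records are short strings held in memory; the function name `fingerprint` and the sample records are hypothetical. It detects tampering with an already approved data set, not poison that was present when the set was approved.

```python
import hashlib


def fingerprint(records):
    """Return a SHA-256 hex digest over an ordered list of text records."""
    digest = hashlib.sha256()
    for record in records:
        data = record.encode("utf-8")
        # Length prefix so that ["ab", "c"] and ["a", "bc"] hash differently.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


# Hypothetical approved training records and the digest stored at approval.
approved = ["spam,Win a prize now", "ham,Meeting at 10"]
approved_digest = fingerprint(approved)

# Later, before retraining: one label has been quietly changed.
tampered = ["ham,Win a prize now", "ham,Meeting at 10"]
unchanged = fingerprint(approved) == approved_digest
modified = fingerprint(tampered) != approved_digest
print(unchanged, modified)
```

Storing the digest separately from the data, under stricter access control, keeps an attacker who can edit the data set from also updating the fingerprint.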
Organizations regularly perform penetration tests against their systems and networks to identify vulnerabilities as part of their cybersecurity strategy. They can conduct similar tests on machine learning models to integrate machine learning into cybersecurity measures. Developers need to attack their own machine learning models to understand their vulnerabilities. Based on the insights gained from this technique, they can build defensive strategies to protect training data sets. Such attacks would also help developers identify what poisoned data points look like, allowing them to design mechanisms to discard contaminated data points.
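To make the idea of attacking one's own model concrete, the sketch below trains a deliberately tiny nearest-centroid classifier on clean data, then retrains it with a handful of injected, mislabeled points and compares accuracy on a trusted holdout set. Everything here is a toy assumption for illustration (the one-dimensional data, the classifier, the function names); a real exercise would run against the organization's actual training pipeline.

```python
def train_centroids(samples):
    """Toy 'model': the mean feature value for each label."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Assign the label whose centroid is closest to `value`."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def accuracy(centroids, samples):
    correct = sum(predict(centroids, v) == y for v, y in samples)
    return correct / len(samples)


# Hypothetical clean training data: label 0 near 1, label 1 near 10.
clean = [(0, 0), (1, 0), (2, 0), (1, 0), (0, 0),
         (9, 1), (10, 1), (11, 1), (10, 1), (9, 1)]
holdout = [(1, 0), (2, 0), (3, 0), (8, 1), (9, 1), (7, 1)]

# Simulated attack: five extreme points injected with label 0,
# dragging that class's centroid away from its real data.
poison = [(30, 0)] * 5

clean_acc = accuracy(train_centroids(clean), holdout)
poisoned_acc = accuracy(train_centroids(clean + poison), holdout)
print(clean_acc, poisoned_acc)
```

In this toy setup the injected points are easy to spot because they are extreme, which is exactly the kind of insight such an exercise is meant to produce: it shows what poisoned points look like for a given model and how few of them are needed to degrade it.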
In a recent talk at the USENIX Enigma conference, Hyrum Anderson, Microsoft's principal architect of Trustworthy Machine Learning, presented a red team exercise where his team reverse-engineered a machine learning model that was used by a resource provisioning service. Although the team didn't have direct access to the model, they found enough information about how the machine learning model gathered necessary data, and they developed a local model replica to test attacks without being detected by the actual system. This entire process allowed the team to understand how they could attack the live system. After gathering all the essential information, the team managed to execute a successful attack that compromised the live system.
Businesses can perform similar processes to identify weaknesses in their machine learning systems and develop effective security measures. Regularly testing machine learning models will help organizations protect their models against several existing cyber attacks as well as new attacks created by adversaries.
Developers and engineers can occasionally alter machine learning algorithms that use classifiers. These changing algorithms and models can be kept secret, which makes them harder to recognize and attack. This is considered a moving-target strategy, and it can help protect machine learning models. To execute this strategy effectively, businesses may need to hire more developers and cybersecurity experts to alter machine learning models and test them for vulnerabilities.
Adversarial machine learning may not seem like an immediate threat right now. But as machine learning is adopted across industries, it could become a force to reckon with. Data poisoning can prove extremely threatening in machine learning-based self-driving cars, where human lives can be at risk. Hence, it is essential to start integrating machine learning into cybersecurity workflows to ensure the safety of the data sets used in machine learning systems. Currently, there aren't any sophisticated tools to protect machine learning models against data poisoning, since cybersecurity experts have only started pointing out such threats in recent years. For now, businesses have to rely on creating holistic cybersecurity strategies that focus on the safety of machine learning models. Cybersecurity experts are expected to launch far more sophisticated tools that can be deployed to protect machine learning models and data sets.