Countering The Underrated Threat Of Data Poisoning Facing Your Organization – Forbes
The utilization of machine learning has skyrocketed over the past few years. The advanced technology has made high-performance computing accessible to almost all businesses out there. Businesses now use machine learning in cybersecurity, social networks, e-commerce websites, search engines, video streaming platforms and more. As organizations and users increasingly rely on machine learning-based applications, security experts have begun warning about adversaries abusing the technology.
Attackers can use data poisoning to severely affect machine learning systems. Machine learning systems are extremely vulnerable to data manipulation. Cybersecurity experts refer to malicious activities by attackers as adversarial machine learning. Adversarial machine learning can be a massive threat to business operations in an organization. Affected machine learning-based applications could produce inaccurate results, affecting business processes drastically. Business leaders need to be mindful of data poisoning on machine learning systems to create proactive strategies to prevent and mitigate such attacks.
Before creating effective strategies to protect machine learning systems, it is essential to understand what data poisoning is and how it can affect businesses. Data poisoning attacks contaminate a machine learning model's training data. Such attacks severely impact the machine learning model's ability to produce accurate predictions. To achieve this, attackers insert custom-made adversarial data into data sets used to train a machine learning model, and the manipulated data is almost undetectable. The length of a data poisoning attack varies based on a model's training cycle. In some cases, it may take weeks for a successful data poisoning attack.
Data poisoning attacks can be performed in a black box scenario as well as a white box scenario. In a black box scenario, an attacker without inside access targets classifiers in a machine learning model that depend on user feedback to learn. In a white box scenario, an attacker illegally gains access to the model and its private data at some point in the supply chain, which can happen when the data is gathered from many sources.
Data poisoning attacks can allow attackers to get access to confidential information in the training data using corrupted data samples. Attackers can also disguise inputs so that a machine learning model misclassifies them. In addition, data poisoning attacks can help adversaries reverse-engineer a machine learning model, so that they can replicate and analyze it locally to prepare for more advanced attacks.
Attackers are already using data poisoning to target big players in the tech industry that use machine learning in cybersecurity. A few years ago, Google revealed that Gmail's spam filter was compromised at least four times, where several spam emails were not marked as spam. Attackers sent millions of emails to throw off the classifier and alter how it defines a spam email. This technique allowed attackers to send several undetected malicious emails containing malware or other cybersecurity threats.
Another example of data poisoning is Microsoft's Twitter chat bot, Tay. Tay was programmed to learn and engage in casual conversation on Twitter. However, malicious users fed offensive tweets into Tay's algorithm, turning the innocent chat bot offensive. As a result, Microsoft had to shut down Tay just 16 hours after launch.
Preventing and mitigating data poisoning can be extremely tricky. Contaminated data is almost impossible to detect, and machine learning models are retrained with data sets at specific intervals depending on their use cases. Since data poisoning is a gradual process that happens over a certain number of training cycles, it is difficult to identify when the accuracy of a machine learning model has begun to decline.
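One way to make a gradual decline visible is to score every retrained model on the same trusted holdout set and compare each result with the best score seen so far, rather than with the previous cycle alone. The following is a minimal sketch under that assumption; the function name, the tolerance value and the accuracy figures are hypothetical and are not taken from any real system.

```python
from typing import List, Optional


def first_degraded_cycle(accuracies: List[float], tolerance: float = 0.02) -> Optional[int]:
    """Return the index of the first training cycle whose holdout accuracy is
    more than `tolerance` below the best accuracy of any earlier cycle.
    Returns None if no cycle degrades that much."""
    best = None
    for cycle, acc in enumerate(accuracies):
        if best is not None and best - acc > tolerance:
            return cycle
        best = acc if best is None else max(best, acc)
    return None


# Hypothetical holdout accuracy after each retraining cycle.
# No single step drops by more than 0.02, but the cumulative drift does.
history = [0.950, 0.948, 0.951, 0.940, 0.925, 0.910]
print(first_degraded_cycle(history))  # 4
```

Comparing against the best earlier score matters here because a slow poisoning campaign may never produce a large drop between two consecutive cycles.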
Mitigating the damage done by data poisoning requires a time-consuming process that includes a historical analysis of all inputs for various classifiers to recognize all bad data samples and eliminate them. After this process, an organization would need to begin retraining the machine learning model from a version before the data poisoning attack. However, this entire procedure can be incredibly complicated and expensive when dealing with a large amount of data as well as a large number of data poisoning attacks. As a result, the affected machine learning model may never get fixed.
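The recovery steps described above (find the bad samples, roll back to a model version that predates them, retrain on the remaining data) can be sketched as follows. This is only an illustration: the `Sample` and `Checkpoint` records, the notion of a numbered training cycle and the `plan_recovery` helper are all assumptions made for the example, and the hard part in practice is producing the list of bad sample IDs in the first place.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Sample:
    sample_id: str
    cycle: int  # training cycle in which the sample entered the data set


@dataclass(frozen=True)
class Checkpoint:
    name: str
    cycle: int  # last training cycle included in this model version


def plan_recovery(
    samples: List[Sample], checkpoints: List[Checkpoint], bad_ids: Set[str]
) -> Tuple[Optional[Checkpoint], List[Sample]]:
    """Pick the newest checkpoint trained before the first bad sample arrived,
    and list the clean samples that must be replayed on top of it.
    The checkpoint is None if every saved version already saw bad data."""
    bad_cycles = [s.cycle for s in samples if s.sample_id in bad_ids]
    if not bad_cycles:
        raise ValueError("no flagged sample matches the data set")
    first_bad = min(bad_cycles)
    clean = [c for c in checkpoints if c.cycle < first_bad]
    restore = max(clean, key=lambda c: c.cycle) if clean else None
    start = restore.cycle if restore else -1
    replay = [s for s in samples if s.cycle > start and s.sample_id not in bad_ids]
    return restore, replay


samples = [Sample("a", 1), Sample("b", 2), Sample("c", 3), Sample("d", 4)]
checkpoints = [Checkpoint("v1", 1), Checkpoint("v2", 2), Checkpoint("v3", 3)]
restore, replay = plan_recovery(samples, checkpoints, {"c"})
print(restore.name, [s.sample_id for s in replay])  # v2 ['d']
```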
Considering the time-consuming and complicated process for detecting and mitigating data poisoning, businesses need to develop a proactive approach to protect machine learning models. Business leaders have to account for the vulnerabilities of machine learning in their organization's cybersecurity strategy. They can consult cybersecurity experts to design strategies that cover machine learning as part of the cybersecurity measures of their business.
Countering the Underrated Threat of Data Poisoning Facing Your Organization
Organizations can consider the following techniques to protect machine learning models from data poisoning:
Machine learning engineers and developers have to focus on steps to block attempts at attacking the model and detect polluted data inputs before the next training cycle begins. For this, developers can perform regression testing, input validity checking, manual moderation, anomaly detection and rate limiting. This approach is simpler and more effective than fixing compromised models.
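As an illustration of the anomaly detection step, the sketch below flags numeric inputs that sit far from the rest of a batch before they reach the next training cycle. It uses the modified z-score based on the median absolute deviation, which is one common choice and not necessarily what any particular team uses; the function name, the threshold of 3.5 and the sample values are assumptions for the example.

```python
from statistics import median
from typing import List


def flag_outliers(values: List[float], threshold: float = 3.5) -> List[int]:
    """Return the indexes of values whose modified z-score exceeds `threshold`.
    The score is 0.6745 * |x - median| / MAD, where MAD is the median
    absolute deviation of the batch."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the batch is identical: flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical feature values submitted for the next training cycle.
batch = [10.1, 9.8, 10.3, 10.0, 9.9, 42.0, 10.2]
print(flag_outliers(batch))  # [5]
```

Flagged inputs would typically go to manual moderation rather than being dropped automatically, since a carefully crafted poisoned sample may look normal and a legitimate sample may look unusual.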
Developers can restrict how many inputs each unique user can provide for the training data, and they can also define how much weight each input carries. A small group of users should not account for the majority of machine learning model training data. In addition, developers can compare newly trained classifiers to the older ones by rolling them out to a small set of users only.
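A per-user cap of this kind can be sketched in a few lines. The example below assumes each submission arrives as a `(user_id, payload)` pair; that format, the helper names and the cap of two inputs per user are hypothetical choices for illustration.

```python
from collections import Counter
from typing import List, Tuple

Submission = Tuple[str, str]  # (user_id, payload)


def cap_per_user(submissions: List[Submission], max_per_user: int) -> List[Submission]:
    """Keep at most `max_per_user` submissions from each user, in arrival order."""
    kept: List[Submission] = []
    seen: Counter = Counter()
    for user_id, payload in submissions:
        if seen[user_id] < max_per_user:
            kept.append((user_id, payload))
            seen[user_id] += 1
    return kept


def largest_user_share(submissions: List[Submission]) -> float:
    """Fraction of the batch contributed by the single most active user."""
    if not submissions:
        return 0.0
    counts = Counter(user_id for user_id, _ in submissions)
    return max(counts.values()) / len(submissions)


raw = [("a", "x1"), ("a", "x2"), ("a", "x3"), ("b", "y1"), ("c", "z1")]
capped = cap_per_user(raw, max_per_user=2)
print(largest_user_share(raw), largest_user_share(capped))  # 0.6 0.5
```

A cap on its own does not stop an attacker who controls many accounts, which is why the share check is worth monitoring alongside rate limiting and account-level controls.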
Attackers need access to a lot of confidential information to execute a successful data poisoning attack. Therefore, organizations should be careful about sharing sensitive data and have strong access control measures in place for the machine learning model as well as data. To do this effectively, business leaders need to make the protection of machine learning models part of the cybersecurity strategy used across the organization. The protection of machine learning models and data is tied to how an organization generally handles cybersecurity. Businesses can also restrict user permissions, enable multi-factor logins, and utilize data and file versioning to keep data sets safer.
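Data and file versioning can be paired with a simple integrity check, so that any change to a training file between snapshots is noticed before retraining. The sketch below records a SHA-256 digest per file and reports which files differ from the recorded manifest. File contents are passed in as bytes to keep the example self-contained; the file names and helper names are hypothetical.

```python
import hashlib
from typing import Dict, List


def build_manifest(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each file name to the SHA-256 hex digest of its contents."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def changed_files(manifest: Dict[str, str], files: Dict[str, bytes]) -> List[str]:
    """Return the names of files that were added, removed or modified
    since `manifest` was recorded."""
    current = build_manifest(files)
    names = set(manifest) | set(current)
    return sorted(n for n in names if manifest.get(n) != current.get(n))


snapshot = {"train.csv": b"x,y\n1,0\n2,1\n", "labels.csv": b"id,label\n1,spam\n"}
manifest = build_manifest(snapshot)

# Simulate someone altering a label after the snapshot was taken.
tampered = {**snapshot, "labels.csv": b"id,label\n1,ham\n"}
print(changed_files(manifest, tampered))  # ['labels.csv']
```

The manifest itself has to be stored somewhere the attacker cannot write to, which ties this check back to the access control measures described above.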
Organizations regularly perform penetration tests against their systems and networks to identify vulnerabilities as part of their cybersecurity strategy. They can conduct similar tests on machine learning models to integrate machine learning into cybersecurity measures. Developers need to attack their own machine learning models to understand their vulnerabilities. Based on the insights gained from this technique, they can build defensive strategies to protect training data sets. Such attacks would also help developers identify what poisoned data points look like, allowing them to design mechanisms to discard contaminated data points.
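A toy version of such a self-attack is sketched below. It trains a very simple nearest-centroid classifier on one-dimensional data, then retrains it after injecting a few mislabeled points and compares accuracy on a trusted test set. The classifier, the data and every name here are hypothetical and chosen only to keep the example small; a real exercise would target the organization's own model and training pipeline.

```python
from statistics import mean
from typing import Dict, List, Tuple

Point = Tuple[float, int]  # (feature value, class label)


def train_centroids(data: List[Point]) -> Dict[int, float]:
    """Compute the mean feature value of each class."""
    labels = {label for _, label in data}
    return {label: mean([x for x, y in data if y == label]) for label in labels}


def predict(centroids: Dict[int, float], x: float) -> int:
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids: Dict[int, float], test: List[Point]) -> float:
    return sum(predict(centroids, x) == y for x, y in test) / len(test)


clean_train = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
trusted_test = [(1.5, 0), (4.0, 0), (6.0, 1), (8.5, 1)]

# Simulated attack: three extreme points deliberately labeled as class 0,
# which drags the class 0 centroid from 2.0 to 11.0.
poison = [(20.0, 0)] * 3

print(accuracy(train_centroids(clean_train), trusted_test))           # 1.0
print(accuracy(train_centroids(clean_train + poison), trusted_test))  # 0.5
```

Here three injected points out of nine are enough to halve accuracy on the trusted test set, and the injected values themselves show what a crude poisoned sample can look like for this kind of model.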
In a recent talk at the USENIX Enigma conference, Hyrum Anderson, Microsoft's principal architect of Trustworthy Machine Learning, presented a red team exercise where his team reverse-engineered a machine learning model that was used by a resource provisioning service. Although the team didn't have direct access to the model, they found enough information about how the machine learning model gathered necessary data, and they developed a local model replica to test attacks without being detected by the actual system. This entire process allowed the team to understand how they could attack the live system. After gathering all the essential information, the team managed to execute a successful attack that compromised the live system.
Businesses can perform similar processes to identify weaknesses in their machine learning systems and develop effective security measures. Regularly testing machine learning models will help organizations protect their models against several existing cyber attacks as well as new attacks created by adversaries.
Developers and engineers can occasionally alter machine learning algorithms that use classifiers. If these changing algorithms and models are kept secret, they are harder to recognize and attack. This is known as a moving target strategy, and it can help in protecting machine learning models. To effectively execute this strategy, businesses may need to hire more developers and cybersecurity experts to alter machine learning models and test them for vulnerabilities.
Adversarial machine learning may not seem like an immediate threat right now. But as machine learning gets adopted in various industries, it could be a force to reckon with. Data poisoning can prove to be extremely threatening in machine learning-based self-driving cars, where human lives can be at risk. Hence, it is essential to start integrating machine learning into the cybersecurity workflow to ensure the safety of data sets used in machine learning systems. Currently, there aren't any sophisticated tools to protect machine learning models against data poisoning, since cybersecurity experts have only started pointing out such threats in recent years. For now, businesses have to rely on creating holistic cybersecurity strategies that focus on the safety of machine learning models. Cybersecurity experts are likely to develop far more sophisticated tools that can be deployed to protect machine learning models and data sets.