TH19101
Detected presence of Pickle serialized data that can execute code.
| priority | CI/CD status | severity | effort | SAFE level | SAFE assessment |
|---|---|---|---|---|---|
| | fail | high | high | 1 | tampering: fail. Reason: unsafe AI models detected |
About the issue
An AI (Artificial Intelligence) model is a mathematical representation of a process that uses algorithms to learn patterns and make predictions based on provided data. After models are trained, their mathematical representations are stored in a variety of data serialization formats. Stored AI models can be shared and reused without the need for additional model training.

Pickle is a popular Python module that many data scientists use for serializing and deserializing AI model data. Pickle is considered an unsafe data format because it allows Python code to be executed during AI model deserialization. Attackers commonly abuse Pickle and other unsafe data serialization formats to hide their malicious payloads.

The serialized Pickle data was found to include Python code that can invoke external scripts and execute arbitrary commands on the computer system that attempts to deserialize the AI model data. While the presence of Python code within Pickle serialized data does not always imply malicious intent, its use in an AI model should be documented and approved. It is recommended that any custom actions needed to load the AI model be kept separate from the serialized model data.
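
To illustrate why deserialization alone is enough to run code, the following minimal sketch uses only the standard `pickle` module. The `Payload` class is hypothetical and deliberately harmless: its `__reduce__` method tells Pickle to call `int("42")` when the data is loaded. A malicious file would name a dangerous callable instead.

```python
import pickle


class Payload:
    """Hypothetical stand-in for an object embedded in model data."""

    def __reduce__(self):
        # Instead of the object's state, pickle records "call int('42')".
        # The callable and its arguments are chosen by whoever wrote the data.
        return (int, ("42",))


data = pickle.dumps(Payload())

# Loading the bytes invokes the recorded callable. No Payload object comes
# back; the result is whatever the callable returned.
restored = pickle.loads(data)
print(type(restored).__name__, restored)  # int 42
```

The loader never asks for confirmation before invoking the callable, which is why Pickle data from untrusted sources should not be deserialized.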
How to resolve the issue
- Investigate reported detections.
- You should delay the software release until the investigation is completed, or until the issue is risk accepted.
- Consider replacing the Pickle data serialization format with a safer alternative.
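
As one possible starting point for an investigation, the sketch below uses the standard `pickletools` module to list opcodes in a Pickle stream without loading it. The opcode set and the `code_capable_opcodes` helper are assumptions made for illustration, not the detection logic used by Spectra Assure. Legitimate pickles of class instances also contain these opcodes, so a non-empty result is a prompt for manual review rather than proof of malicious intent.

```python
import pickle
import pickletools

# Opcodes that import names or invoke callables while loading (illustrative set).
CODE_CAPABLE = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST",
    "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def code_capable_opcodes(data: bytes) -> list[str]:
    """Return the code-capable opcodes found in a pickle stream.

    pickletools.genops only parses the stream; it never executes it.
    """
    return [
        opcode.name
        for opcode, _arg, _pos in pickletools.genops(data)
        if opcode.name in CODE_CAPABLE
    ]


class Suspicious:
    """Hypothetical object whose pickle invokes a callable on load."""

    def __reduce__(self):
        return (int, ("42",))  # harmless stand-in for a real payload


plain = pickle.dumps({"weights": [0.1, 0.2]}, protocol=4)
flagged = pickle.dumps(Suspicious(), protocol=4)

print(code_capable_opcodes(plain))    # []
print(code_capable_opcodes(flagged))  # includes REDUCE
```

For a fuller view of a flagged file, `pickletools.dis(data)` prints a complete disassembly, including the module and attribute names the stream would import.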
Incidence statistics
ReversingLabs periodically collects and analyzes the contents of popular software package repositories for threat research purposes. Analysis results are used to calculate incidence statistics for issues (policy violations) that Spectra Assure can detect in software packages.
This section is updated when new data becomes available.
Total number of packages analyzed
- RubyGems: 183K
- NuGet: 644K
- PyPI: 628K
- npm: 3.72M
Recommended reading
- pickle - Python object serialization (External resource - Python documentation)
- OWASP Top 10 for LLMs and Generative AI Apps (External resource - OWASP.org)
- Paws in the Pickle Jar: Risk & Vulnerability in the Model-sharing Ecosystem (External resource - Splunk)
- Guidelines for secure AI system development (External resource - UK National Cyber Security Centre (NCSC))