
TH19102

Detected presence of Pickle serialized data that has networking capabilities.

| priority | CI/CD status | severity | effort | SAFE level | SAFE assessment |
| | fail | high | high | 1 | tampering: fail |
Reason: unsafe AI models detected

About the issue

An AI (artificial intelligence) model is a mathematical representation of a process that uses algorithms to learn patterns and make predictions based on provided data. After a model is trained, its mathematical representation is stored in one of a variety of data serialization formats. Stored AI models can be shared and reused without additional training.

Pickle is a popular Python module that many data scientists use for serializing and deserializing AI model data. Pickle is considered an unsafe data format because it allows Python code to be executed during deserialization. Attackers commonly abuse Pickle and other unsafe data serialization formats to hide malicious payloads.

This issue is reported when the serialized Pickle data includes Python code that can access web resources and make network requests on the computer system that attempts to deserialize the AI model data. While the presence of Python code within Pickle serialized data does not always imply malicious intent, its use in an AI model should be documented and approved. It is recommended that any custom actions needed to load the AI model be kept separate from the serialized model data.
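To make the mechanism concrete, the following is a minimal sketch of how a Pickle stream can reference a network-capable function, and how such a reference can be found without deserializing anything. It is not the detection logic used by Spectra Assure. The `Downloader` class, the `NETWORK_MODULES` list, and the helper names are hypothetical, and the look-back used for `STACK_GLOBAL` is a heuristic rather than a full emulation of the Pickle stack.

```python
import pickle
import pickletools

# Illustrative list only; not the rule set of any real scanner.
NETWORK_MODULES = {"socket", "urllib", "http", "ftplib", "smtplib", "requests"}

# Opcodes that push text strings (protocol 4+ passes import names this way).
STRING_OPCODES = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


class Downloader:
    """Hypothetical object whose pickle asks the loader to call a function."""

    def __reduce__(self):
        # On load, pickle would call urllib.request.urlopen(<url>).
        import urllib.request
        return (urllib.request.urlopen, ("http://example.invalid/",))


def referenced_globals(data):
    """Return (module, name) pairs a pickle would import, without loading it."""
    found = set()
    strings = []  # string constants seen so far, in stream order
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # Older protocols carry "module name" as a single argument.
            module, name = arg.split(" ", 1)
            found.add((module, name))
        elif opcode.name in STRING_OPCODES:
            strings.append(arg)
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Heuristic: assume the last two strings are module and name.
            found.add((strings[-2], strings[-1]))
    return found


def network_references(data):
    """Keep only references whose top-level module looks network-capable."""
    return sorted(
        (module, name)
        for module, name in referenced_globals(data)
        if module.split(".")[0] in NETWORK_MODULES
    )


# Serializing only records the reference; urlopen is not called here.
payload = pickle.dumps(Downloader())
print(network_references(payload))
```

The sketch never calls `pickle.loads` on the payload, because loading is the step that would trigger the network request.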

How to resolve the issue

  • Investigate reported detections.
  • Delay the software release until the investigation is complete or the risk is formally accepted.
  • Consider replacing the Pickle data serialization format with a safer alternative.
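As one possible reading of the last recommendation, the sketch below stores model parameters in a data-only format (JSON from the Python standard library) and keeps all loading and inference logic in ordinary, reviewable code. The function names, the `format` field, and the toy linear model are assumptions made for illustration; real models would more likely use a dedicated data-only tensor format.

```python
import json


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def dump_model(weights, bias):
    """Serialize plain numbers only; nothing executable is stored."""
    return json.dumps({"format": 1, "weights": list(weights), "bias": bias})


def load_model(text):
    """Parse and validate the data; json.loads never imports or calls code."""
    doc = json.loads(text)
    weights = doc["weights"]
    bias = doc["bias"]
    if not isinstance(weights, list) or not all(_is_number(w) for w in weights):
        raise ValueError("weights must be a list of numbers")
    if not _is_number(bias):
        raise ValueError("bias must be a number")
    return weights, bias


def predict(weights, bias, features):
    """Inference lives in reviewed code, not in the serialized file."""
    return sum(w * x for w, x in zip(weights, features)) + bias


text = dump_model([0.5, -1.0], 0.25)
weights, bias = load_model(text)
print(predict(weights, bias, [2.0, 1.0]))
```

Because the file holds only numbers, any custom action needed to load the model stays in source code, where it can be documented and approved separately.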

Incidence statistics

ReversingLabs periodically collects and analyzes the contents of popular software package repositories for threat research purposes. Analysis results are used to calculate incidence statistics for issues (policy violations) that Spectra Assure can detect in software packages.

This section is updated when new data becomes available.

Total number of packages analyzed

  • RubyGems: 183K
  • NuGet: 644K
  • PyPI: 628K
  • npm: 3.72M

Total detections per repository

For every repository, the chart shows the number of packages that triggered the software assurance policy. In other words, it shows how many packages in each package repository were found to have the specific issue described on this page. This information helps you understand how common the issue is across different software communities.

If a repository is absent from the chart, that means none of the packages in that repository triggered this policy during analysis, or the policy was not used during analysis.

Distribution of total detections by project popularity

For every repository, the chart shows how many of the total detections belong to the Top 100 (1-100), Top 1000 (101-1000) and Top 10 000 (1001-10 000) most downloaded projects. This information helps you understand the impact of the issue within each community, making it clearer when the issue affects the most popular projects.

If the chart shows zero values for all of the top project groups, that means all detections were in unranked projects (lower than 10 000 on the list of most downloaded projects).