How Does Asset Merging Work?

Introduction

The host and asset merge process is necessary to avoid creating duplicate assets in a Platform tenant.  Reliable asset and host merging is also essential to both an accurate Asset Database and the effective operation of Dynamic Remediation.

  • Duplicate assets in the Asset Database are most often the result of inappropriate Merge Settings for the types of scans being imported into the Platform.

  • Unresolved issues, or issues that should have been automatically remediated by the Platform’s Dynamic Remediation feature, are overlooked, again because of inappropriate Merge Settings and the resulting inaccurate Asset Database.

In this article we’ll explore the Platform’s Asset Merging capability and how to make the best use of it.

Merge Operation

When issues are imported into a Phase, the Platform must determine whether each issue's affected host(s) already exist in the tenant's Asset Database. The Platform uses a merge process to achieve this.

During issue import (whether automatic or manual), the tenant's Merge Settings (whether defined tenant-wide or for Asset Groups) tell the Platform which attributes of the issue's host(s) must match an existing asset in the tenant’s Asset Database. There are three possible Merge Settings that can be defined:

 


  • Based on both Hostname and IP Address - the Platform will evaluate each affected host of an imported issue against the Asset Database, looking for a match where an asset shares the same IP address and the same hostname.

  • Based on IP Address - the Platform will evaluate each affected host of an imported issue against the Asset Database, looking for a match where an asset shares the same IP address.

  • Based on Hostname - the Platform will evaluate each affected host of an imported issue against the Asset Database, looking for a match where an asset shares the same hostname. NOTE: the hostname must be an exact match. If an imported host is reported with both a host and domain name, then an asset with the same host and domain name must already exist for the Platform to consider it a match.

If a match is found, the Platform will update the existing asset with Phase and Issue data from the imported issue(s).  The relevant Phase name will be appended to the asset's Assessment History whilst the newly imported issue(s) affecting the asset will be appended to the asset's Affected Instances.  

If no match is found, the Platform will insert the issue's affected host(s) into the Asset Database as new asset(s) before adding the relevant Phase and Issue information (as described above).
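The match-or-insert behaviour described above can be sketched in Python. This is only an illustration under stated assumptions, not the Platform's implementation: `MergeSetting`, `Asset`, `is_match` and `merge_host` are hypothetical names, and treating an "exact match" as plain string equality is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class MergeSetting(Enum):
    HOSTNAME_AND_IP = "Based on both Hostname and IP Address"
    IP = "Based on IP Address"
    HOSTNAME = "Based on Hostname"


@dataclass
class Asset:
    hostname: str
    ip: str
    assessment_history: list = field(default_factory=list)
    affected_instances: list = field(default_factory=list)


def is_match(asset, hostname, ip, setting):
    """Return True if the imported host matches the asset under the setting."""
    same_host = asset.hostname == hostname  # assumption: exact string equality
    same_ip = asset.ip == ip
    if setting is MergeSetting.HOSTNAME_AND_IP:
        return same_host and same_ip
    if setting is MergeSetting.IP:
        return same_ip
    return same_host


def merge_host(database, hostname, ip, phase, issue, setting):
    """Update the first matching asset, or insert a new asset if none matches."""
    asset = next((a for a in database if is_match(a, hostname, ip, setting)), None)
    if asset is None:
        asset = Asset(hostname, ip)
        database.append(asset)
    asset.assessment_history.append(phase)   # Phase name added to history
    asset.affected_instances.append(issue)   # issue added to affected instances
    return asset
```

With Based on Hostname, importing `web01.example.com` and then `web01` in this sketch produces two assets, because the two strings are not an exact match.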

Tenant-wide Merge Settings vs Asset-Group Merge Settings

Although Asset Groups are beyond the scope of this article specifically, from an asset merging perspective it is possible to define different Merge Settings for an Asset Group.  That is to say there can be Tenant-wide Merge Settings or Asset Group Merge Settings.

The Merge Settings for an Asset Group (if they differ from the tenant-wide Merge Settings) take priority over the tenant-wide Merge Settings.

There may be situations where an Asset Group represents a group of dynamic assets whose IP addresses, for example, change regularly and are not predictable/static. In such situations, it is recommended to use a Merge Setting of Based on Hostname for any Asset Group(s) that contain these types of dynamic assets. A common example is imported scan data derived from agent-based scans (e.g. Tenable Nessus Agent scans or Qualys Cloud Agent scans). In this scenario the target host's hostname is usually predictable/consistent across multiple scans, whilst the IP address is likely to change between scans (especially for remote ‘off-network’ agents).

Conversely, an Asset Group that contains predictable or static-IP assets will benefit from using either Based on IP Address or Based on both Hostname and IP Address.
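The priority rule can be pictured as a simple lookup with a fallback. The sketch below is illustrative only: the Asset Group names, the `ASSET_GROUP_SETTINGS` mapping and the `effective_merge_setting` function are hypothetical and do not represent a Platform API.

```python
from typing import Optional

TENANT_WIDE = "Based on both Hostname and IP Address"

# Hypothetical Asset Groups; None means the group defines no setting of its own.
ASSET_GROUP_SETTINGS = {
    "Agent-scanned laptops": "Based on Hostname",
    "Static-IP servers": "Based on IP Address",
    "Legacy hosts": None,
}


def effective_merge_setting(group: Optional[str]) -> str:
    """Asset Group setting wins when defined; otherwise use the tenant-wide one."""
    group_setting = ASSET_GROUP_SETTINGS.get(group)
    return group_setting if group_setting is not None else TENANT_WIDE
```

An asset in "Agent-scanned laptops" would be merged on hostname in this sketch, while an asset in no group, or in a group without its own setting, falls back to the tenant-wide setting.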

The Importance of Asset Accuracy & Asset Merging for Dynamic Remediation

A fundamental operation of Dynamic Remediation is the ability to accurately identify and compare hosts between two scans (the latest imported scan and the most recent previous scan). For Dynamic Remediation to successfully auto-remediate issues between two scans, matching hosts is a crucial component; otherwise Dynamic Remediation won’t auto-remediate issues.

Dynamic Remediation will only remediate issues if, when comparing two scans, there is a match on the latest issue’s affected instance/asset (and service/port if available).

As such, for Dynamic Remediation to operate effectively, it is necessary to have a well-maintained and accurate Asset Database in the Platform. In turn, an accurate Asset Database is only achievable if the host information from imported scan data is merged correctly to avoid duplicate assets. So it is safe to assume that incorrect Merge Settings will have a negative impact on the Asset Database’s accuracy as well as on the operation of Dynamic Remediation.
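One plausible reading of the comparison rule above is sketched below. It is an assumption-laden illustration rather than the Platform's algorithm: the `remediated_findings` function and the data shapes are hypothetical, each scan is modelled as a mapping from an asset identifier to a set of `(port, issue)` pairs, and an issue is treated as remediated only when its asset appears in both scans and the latest scan no longer reports it.

```python
def remediated_findings(previous, latest):
    """Return (asset_id, (port, issue)) pairs that disappeared between scans.

    Assets missing from the latest scan are skipped: without a host match,
    nothing is auto-remediated (assumed behaviour for this sketch).
    """
    remediated = []
    for asset_id, old_findings in previous.items():
        if asset_id not in latest:
            continue  # no matching host in the latest scan
        for finding in sorted(old_findings - latest[asset_id]):
            remediated.append((asset_id, finding))
    return remediated


previous_scan = {
    "asset-1": {(443, "Weak TLS"), (22, "Old SSH")},
    "asset-2": {(80, "XSS")},
}
latest_scan = {
    "asset-1": {(22, "Old SSH")},
    # The same host was re-imported as a duplicate, so asset-2 has no match.
    "asset-3": {(80, "XSS")},
}
result = remediated_findings(previous_scan, latest_scan)
```

Here only the "Weak TLS" finding on `asset-1` is picked up. Nothing happens for `asset-2`, because the host it represents was imported as a separate asset and so never matched.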

Consider the following scenario where an incorrect Merge Setting can negatively affect Asset Database accuracy and Dynamic Remediation operation:

  • Assume the default tenant-wide Merge Settings are Based on both Hostname and IP Address.

  • The first of 12 agent-based monthly scans is performed in January and automatically imported into a new Project in the Platform.

    • The Project utilises Dynamic Remediation without Human Intervention.

    • As this is the first scan, the Asset Merge process will insert each unique host into the Asset Database as a new asset.

    • Additionally, as this is the first scan, there is no previous scan for Dynamic Remediation to compare against, so it is skipped.

  • The second agent-based monthly scan in February is performed against the same targets and the results are automatically imported into the same Project in the Platform.

    • The scan data includes all of the same hosts as January’s scan, but for some of the hosts the IP address has changed (they are dynamic hosts after all).

    • Recall that the Merge Settings are Based on both Hostname and IP Address. This setting dictates that for a host to merge with an existing Asset (and be considered the same Asset), both the hostname and the IP address must match.

      • This incorrect Merge Setting now causes a duplicate asset to appear in the Asset Database. The Platform found no match for the asset (as the IP addresses were different) and instead inserted a new, albeit duplicate, host into the Asset Database.

      • This incorrect Merge Setting now causes Dynamic Remediation to skip automatically remediating issues on applicable hosts. Dynamic Remediation doesn’t consider the host in February’s scan and the host in January’s scan to be a match.

      • This incorrect Merge Setting means that, when subsequent scan data is imported (i.e. for March’s scan and onwards), issues that should have been remediated automatically between January’s and February’s scans will remain open/published until a user manually reviews and updates them.
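The scenario above can be reproduced with a few lines of Python. This is a simplified sketch: the hostnames, IP addresses and the `import_scan` and `asset_count` helpers are invented for illustration, and only the matching rule described in this article is modelled.

```python
def import_scan(database, scan, match_on):
    """Insert each scanned host unless an asset matches on every key in match_on."""
    for hostname, ip in scan:
        host = {"hostname": hostname, "ip": ip}
        if not any(all(a[k] == host[k] for k in match_on) for a in database):
            database.append(host)
    return database


january = [("laptop-01", "10.1.0.10"), ("laptop-02", "10.1.0.11")]
# In February laptop-02 reports a different IP address.
february = [("laptop-01", "10.1.0.10"), ("laptop-02", "10.1.0.57")]


def asset_count(match_on):
    """Import both monthly scans into an empty database and count the assets."""
    database = []
    import_scan(database, january, match_on)
    import_scan(database, february, match_on)
    return len(database)


both = asset_count(("hostname", "ip"))   # 3: laptop-02 is duplicated
hostname_only = asset_count(("hostname",))  # 2: no duplicates
```

Matching on both attributes leaves three assets for two real hosts, whereas matching on hostname alone keeps the count at two.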

Default Merge Settings = Make No Assumptions

The Platform errs on the side of caution and creates duplicate hosts rather than incorrectly merging hosts and assets together.

There is good reason why a new Platform tenant defaults to a tenant-wide Merge Setting of Based on both Hostname and IP Address. The Platform errs on the side of caution and creates duplicate hosts rather than incorrectly assuming a match and merging hosts and assets together. In situations where Platform users consider two assets to be the same, the decision to merge rests with the user, who can merge them manually.

The required matching criteria of Based on both Hostname and IP Address mean that two key identifiers/attributes of a host must match for it to be merged. With few exceptions, there is a high degree of confidence that the imported host, when compared to a pre-existing instance of that host, is the same asset if both the hostnames and IP addresses match.

General Guidance & Best Practice

Make Use Of Asset Group Merge Settings

When importing scan results for the first time into the Platform, it is recommended that any asset reported by the scanning technology is added to an Asset Group. Since Merge Settings for an Asset Group take priority over tenant-wide Merge Settings, Asset Groups can accommodate differing scan types where the consistency of IP address and hostname information varies because of differing methods of host identification and scanning configuration.

For example:

  • Target hosts with changing/dynamic IP addresses but consistent hostnames (such as those in agent-based scans) would benefit from being in a common Asset Group with that group’s Merge Settings set to Based on Hostname.

  • Target hosts with predictable static IP addresses but inconsistent or incomplete DNS hostname information would benefit from being in an Asset Group with the group’s Merge Settings set to Based on IP Address.

Ensure Hostname Accuracy in Scan Results

The Platform relies solely on the information provided by imported scan data to identify hostname information. It makes no assumptions. Therefore, if network-based scans are being performed it is important to ensure that hostname information (in the scan results) is reliable and accurate. This largely requires proper configuration of the scanner technology and DNS. Ensuring appropriate DNS coverage and accurate DNS records are in place for the scanned environment(s) will already be necessary for many scanning technologies to reliably identify a target’s hostname. Reliable and accurate hostname data being imported into the Platform will help avoid duplicate assets and ineffective operation of Dynamic Remediation.

When Performing External Scans using Hostnames / FQDNs…

When performing external scans of public-facing targets, if the target is defined by its hostname or fully qualified domain name (FQDN), such as http://www.mywebsite.com, be conscious of how the hostname may resolve to one or more changing IP addresses each time the target host is scanned. This is especially important for targets that are hosted behind a multi-homed load balancer, reverse proxy, or Content Delivery Network (CDN).

An example of this is where a host is scanned by its FQDN and the DNS record for that host points to an AWS Elastic Load Balancer (ELB). Each time the host is scanned, the FQDN may resolve to a different AWS ELB IP address. If these scan results are imported into the Platform with Merge Settings defined as Based on IP Address, the Platform’s asset merge process may find no match because the IP address for the same target host changed from one scan to the next. Where the scanning technology allows, ensure that scan results are reported primarily by their hostname and choose Based on Hostname as the Merge Setting in the Platform.
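Before settling on Based on IP Address for an externally scanned FQDN, it can help to check how the name resolves. The sketch below uses Python's standard `socket.getaddrinfo`; the `resolved_ips` and `ip_is_stable` helpers are hypothetical names. Treat the result as a rough indication only: DNS caching can hide rotation, and a few lookups in quick succession will not reveal changes that happen over days or weeks.

```python
import socket


def resolved_ips(hostname, resolver=socket.getaddrinfo):
    """Return the set of IP addresses the hostname currently resolves to."""
    results = resolver(hostname, None)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address.
    return {sockaddr[0] for _family, _type, _proto, _canon, sockaddr in results}


def ip_is_stable(hostname, samples=3, resolver=socket.getaddrinfo):
    """True if repeated lookups only ever return one and the same address."""
    seen = set()
    for _ in range(samples):
        seen |= resolved_ips(hostname, resolver)
    return len(seen) == 1
```

If `ip_is_stable` returns False for a target, merging on IP address is likely to produce duplicates for that host, and Based on Hostname is the safer choice. The `resolver` parameter exists only so the lookup can be replaced when testing offline.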

Periodically Review The Asset Database

The Platform applies multiple decision-making operations when considering which hosts and assets to merge and, where Merge Settings are appropriately defined, provides a robust automated asset management process for the majority of use-cases. Even so, assets may still be created in the Asset Database that Platform users consider duplicates. In such situations, users can invoke manual asset merging within the main Assets view in the Platform. The Platform will also ‘be on the lookout’ for assets that it considers to be duplicates.

Take, for example, this view of a tenant’s Assets page: sorting by the “IP Address” column reveals a number of potential duplicate assets. The Platform is also alerting the user to the fact that 18 potential duplicate assets exist.

[Screenshot: Assets page sorted by the “IP Address” column, with the Platform’s alert for potential duplicate assets]

Using this information, users can make an informed decision on whether to merge these three pairs of assets into single assets.

Alternatively, have the Platform step you through the assets it considers to be duplicates.

Tenants Utilising Multiple Companies

When reviewing assets for potential manual merging, also consider which Company each asset is associated with. This is especially important if the tenant has been structured so that multiple Companies exist for the purpose of organising data based on an organisation’s business unit structure or departmental structure.

Two or more assets that appear to be duplicates may be associated with different Companies.

For example, an organisation may have three Companies defined in its Platform tenant (EMEA, APAC and LATAM), as well as various Projects and scan data for each Company.

If a user in the EMEA Company is reviewing supposed duplicate assets, it is important to identify whether the supposed duplicates share the same Company. There may be situations where an asset that was reported in a scan for the EMEA Company also appears in a scan reported for the LATAM Company. Whilst on the surface it may appear to be a duplicate, the same asset(s) may deliberately co-exist in EMEA and in LATAM (perhaps because of overlapping IP subnets in both these regions).

Platform users can enable the “Company” column in the Assets view to check which Company asset(s) are associated with before making a decision on whether to merge these assets or not.
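The manual review described in this section can be sketched as a grouping exercise over a list of asset records. The sketch is illustrative only: the record fields, the sample data and the `duplicate_candidates` function are hypothetical, and the Platform's own duplicate detection is not described in this article. Assets are flagged only when they share both an IP address and a Company, so assets that co-exist in different Companies are left alone for the user to judge.

```python
from collections import defaultdict


def duplicate_candidates(assets):
    """Group hostnames by (company, ip) and keep groups with more than one asset."""
    groups = defaultdict(list)
    for asset in assets:
        groups[(asset["company"], asset["ip"])].append(asset["hostname"])
    return {key: names for key, names in groups.items() if len(names) > 1}


assets = [
    {"hostname": "db01", "ip": "192.168.1.10", "company": "EMEA"},
    {"hostname": "db01.corp.local", "ip": "192.168.1.10", "company": "EMEA"},
    # Same IP address, but in a different Company with an overlapping subnet.
    {"hostname": "files01", "ip": "192.168.1.10", "company": "LATAM"},
    {"hostname": "web01", "ip": "192.168.1.20", "company": "APAC"},
]
candidates = duplicate_candidates(assets)
```

In this sample only the two EMEA records are flagged as a candidate pair. The LATAM asset shares the IP address but not the Company, so it is not suggested for merging.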

Summary

As detailed in this article, Asset Merging is an important function of the Platform to minimise duplicates and to maximise effectiveness of the Dynamic Remediation feature.

When structuring Projects in the Platform ready to capture imported scan data, consider what targets are being scanned, how they are being scanned, and the quality/accuracy of the scan data that is provided to the Platform during import. Every organisation is different, and adjustments will likely need to be made to Merge Settings and scanner technologies to align their operations and make the most of Platform features.