Overview

This section explains, at a high level, the steps FlowPrint takes to create fingerprints and compare them in order to recognize known apps or detect unseen apps.

Flow extraction

Fingerprint generation

Fingerprint application

App recognition

Unseen app detection

Flow extraction

FlowPrint itself takes an array of Flow objects as input, so these flows must first be extracted from the actual network traffic. Currently, FlowPrint extracts flows from .pcap files using the Preprocessor object. Its preprocessor.Preprocessor.process() method takes .pcap files and their labels as input and outputs Flow objects with their corresponding labels. Internally, the Preprocessor class uses the Reader and Flow classes to produce the Flow objects. Flow objects can be saved to and loaded from files using the preprocessor.Preprocessor.save() and preprocessor.Preprocessor.load() methods, respectively. Figure 1 gives an overview of the flow extraction process.

Figure 1: Overview of flow extraction.
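To make the idea of turning packets into flows concrete, the following is a minimal, self-contained sketch. It does not use or reproduce the Preprocessor, Reader, or Flow classes. The Packet record, the extract_flows() function, and the choice to key flows on protocol plus the two endpoints are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical, simplified packet record; not a FlowPrint class.
    timestamp: float
    src: str
    sport: int
    dst: str
    dport: int
    protocol: str
    length: int


def extract_flows(packets):
    """Group packets into bidirectional flows keyed by protocol and endpoints."""
    flows = defaultdict(list)
    for packet in sorted(packets, key=lambda p: p.timestamp):
        # Sort the two endpoints so both directions map to the same key.
        endpoints = tuple(sorted([(packet.src, packet.sport),
                                  (packet.dst, packet.dport)]))
        flows[(packet.protocol, endpoints)].append(packet)
    return dict(flows)


packets = [
    Packet(0.0, "10.0.0.2", 50000, "93.184.216.34", 443, "TCP", 517),
    Packet(0.1, "93.184.216.34", 443, "10.0.0.2", 50000, "TCP", 1400),
    Packet(0.2, "10.0.0.2", 50001, "8.8.8.8", 53, "UDP", 60),
]
flows = extract_flows(packets)
for key, members in flows.items():
    print(key, len(members))
```

In this toy input, the two TCP packets travel in opposite directions between the same endpoints and therefore end up in one flow, while the UDP packet forms a flow of its own.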

Fingerprint generation

After extracting Flow objects, FlowPrint generates Fingerprint objects from them. We refer to our paper for a detailed description of this process. The code implements it as shown in Figure 2: the entire generation process takes place in the FingerprintGenerator object, which uses the following classes, in order:

Cluster

CrossCorrelationGraph

Fingerprint

Figure 2: Overview of fingerprint generation.
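The three-stage structure (cluster, correlate, fingerprint) can be sketched in plain Python as follows. This is a simplified illustration and not the library's implementation: the function names, the (timestamp, destination) flow records, the 30-second time slot, the 0.5 threshold, and the use of Jaccard overlap and connected components as stand-ins for the actual correlation and graph steps are all assumptions. The paper defines the real procedure.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical flow records: (timestamp in seconds, destination).
flows = [
    (1.0, "api.example.com:443"),
    (2.0, "cdn.example.com:443"),
    (31.0, "api.example.com:443"),
    (33.0, "cdn.example.com:443"),
    (65.0, "ads.example.net:443"),
]


def cluster_by_destination(flows):
    """Stage 1: group flows that share a destination."""
    clusters = defaultdict(list)
    for timestamp, destination in flows:
        clusters[destination].append(timestamp)
    return dict(clusters)


def active_slots(timestamps, slot=30.0):
    """Indices of the time slots in which a cluster was active."""
    return {int(t // slot) for t in timestamps}


def correlation_graph(clusters, threshold=0.5):
    """Stage 2: link clusters that are active in the same time slots."""
    edges = defaultdict(set)
    for a, b in combinations(clusters, 2):
        slots_a, slots_b = active_slots(clusters[a]), active_slots(clusters[b])
        overlap = len(slots_a & slots_b) / len(slots_a | slots_b)
        if overlap >= threshold:
            edges[a].add(b)
            edges[b].add(a)
    return dict(edges)


def build_fingerprints(clusters, edges):
    """Stage 3: each connected group of clusters becomes one fingerprint."""
    seen, result = set(), []
    for start in clusters:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(edges.get(node, set()) - component)
        seen |= component
        result.append(frozenset(component))
    return result


clusters = cluster_by_destination(flows)
edges = correlation_graph(clusters)
fingerprints = build_fingerprints(clusters, edges)
```

With this toy input, the api and cdn destinations are active in the same two time slots and are merged into one fingerprint, while the ads destination is active on its own and forms a second one.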

Fingerprint application

This library implements FlowPrint’s app recognition and unseen app detection applications.

App recognition

To recognize known apps, use FlowPrint’s recognize(X) method. This method creates new Fingerprint objects for the given Flow objects X and compares them to the fingerprints previously stored by the fit() method. It returns the closest matching fingerprint for each Flow in X.
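The closest-match idea can be illustrated with a small standalone sketch. It assumes that a fingerprint can be modelled as a set of destinations and that closeness is measured by Jaccard similarity. Both are assumptions for illustration, as are the jaccard() and recognize() helpers below, which are not the library's methods.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recognize(known, unknown):
    """Return the label of the most similar known fingerprint for each input.

    known:   list of (fingerprint, label) pairs, as stored during fitting.
    unknown: list of fingerprints to label.
    """
    labels = []
    for fingerprint in unknown:
        best = max(known, key=lambda item: jaccard(item[0], fingerprint))
        labels.append(best[1])
    return labels


# Hypothetical stored fingerprints and their app labels.
known = [
    (frozenset({"api.chat.example:443", "cdn.chat.example:443"}), "chat"),
    (frozenset({"api.maps.example:443", "tiles.maps.example:443"}), "maps"),
]
unknown = [
    frozenset({"api.maps.example:443", "tiles.maps.example:443",
               "ads.example.net:443"}),
    frozenset({"api.chat.example:443"}),
]
labels = recognize(known, unknown)
print(labels)  # ['maps', 'chat']
```

Note that a closest-match rule of this kind always returns some known label, even for traffic from an app that was never seen, which is why unseen app detection is handled separately.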

Unseen app detection

To detect unseen apps, use FlowPrint’s detect(X, threshold=0.1) method. This method creates new Fingerprint objects for the given Flow objects X and compares them to the fingerprints previously stored by the fit() method. It returns +1 for each Flow in X that matches a known fingerprint and -1 for each Flow that does not match any known fingerprint.
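The +1/-1 output convention can be illustrated with a short standalone sketch. It assumes that fingerprints are sets of destinations and that a "match" means a Jaccard similarity of at least the threshold with some known fingerprint. This reading of the threshold, and the similarity() and detect() helpers, are assumptions for illustration rather than the library's implementation.

```python
def similarity(a, b):
    """Jaccard similarity between two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def detect(known, unknown, threshold=0.1):
    """Return +1 for inputs matching a known fingerprint, -1 otherwise."""
    return [
        1 if any(similarity(k, fingerprint) >= threshold for k in known) else -1
        for fingerprint in unknown
    ]


# Hypothetical fingerprints stored during fitting.
known = [
    frozenset({"api.chat.example:443", "cdn.chat.example:443"}),
    frozenset({"api.maps.example:443", "tiles.maps.example:443"}),
]
unknown = [
    frozenset({"api.chat.example:443", "cdn.chat.example:443",
               "ads.example.net:443"}),
    frozenset({"tracker.new.example:443"}),
]
result = detect(known, unknown)
print(result)  # [1, -1]
```

Under this assumption, raising the threshold makes the check stricter, so more traffic is flagged as coming from an unseen app.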