Overview

This section explains at a high level the steps FlowPrint takes to create fingerprints and to compare them in order to recognize known apps or detect unseen apps.

Flow extraction

FlowPrint itself takes an array of Flow objects as input, so these flows must first be extracted from the actual network traffic. Currently, FlowPrint extracts these flows from .pcap files using the Preprocessor object. This class provides the preprocessor.Preprocessor.process() method, which takes .pcap files and their labels as input and outputs Flow objects with their corresponding labels. The Preprocessor class uses the Reader and Flow classes to produce Flow objects. These Flow objects can be saved to and loaded from files using the preprocessor.Preprocessor.save() and preprocessor.Preprocessor.load() methods, respectively. Figure 1 gives an overview of the flow extraction process.

../_images/overview_processing.png

Figure 1: Overview of flow extraction.
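To make the idea of a flow more concrete, the following is a minimal, self-contained sketch of grouping packets into flows. It does not use FlowPrint's Preprocessor, Reader, or Flow classes and does not read .pcap files. The Packet record and the extract_flows() helper are hypothetical names introduced for illustration, and grouping packets by a direction-agnostic 5-tuple is an assumption about what constitutes a flow, not a description of the library's internals.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Packet:
    """Hypothetical, already-parsed packet record (not a FlowPrint class)."""
    timestamp: float
    protocol: str
    src: str
    sport: int
    dst: str
    dport: int
    length: int


# Assumed flow key: (protocol, endpoint A, endpoint B) with ordered endpoints.
FlowKey = Tuple[str, str, int, str, int]


def extract_flows(packets: List[Packet]) -> Dict[FlowKey, List[Packet]]:
    """Group packets into flows, treating both directions as one flow."""
    flows: Dict[FlowKey, List[Packet]] = defaultdict(list)
    for packet in sorted(packets, key=lambda p: p.timestamp):
        # Order the two endpoints so that A->B and B->A share the same key.
        first, second = sorted([(packet.src, packet.sport),
                                (packet.dst, packet.dport)])
        key = (packet.protocol, first[0], first[1], second[0], second[1])
        flows[key].append(packet)
    return dict(flows)


packets = [
    Packet(0.0, "TCP", "10.0.0.2", 50000, "93.184.216.34", 443, 60),
    Packet(0.1, "TCP", "93.184.216.34", 443, "10.0.0.2", 50000, 1500),
    Packet(0.2, "UDP", "10.0.0.2", 53000, "8.8.8.8", 53, 80),
]
flows = extract_flows(packets)
print(len(flows))  # 2: one TCP flow (both directions) and one UDP flow
```

In the actual library, this step is handled by preprocessor.Preprocessor.process(), which additionally attaches the label of the originating .pcap file to each Flow.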

Fingerprint generation

After extracting Flow objects, FlowPrint generates Fingerprint objects. We refer to our paper for a detailed description of this process. Figure 2 shows how the code implements it: the entire generation process takes place in the FingerprintGenerator object, which uses, in order, the classes shown in the figure.

../_images/overview_generation.png

Figure 2: Overview of fingerprint generation.

Fingerprint application

This library implements FlowPrint’s app recognition and unseen app detection applications.

App recognition

To recognize known apps, use FlowPrint’s recognize(X) method. This method creates new Fingerprint objects for the given Flow objects X and compares them to the fingerprints previously stored by the fit() method. It returns the closest matching fingerprint for each given Flow in X.
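The notion of a "closest matching fingerprint" can be illustrated with a small, self-contained sketch. It does not call FlowPrint. Here a fingerprint is modelled as a plain set of (IP, port) destinations and closeness is measured with Jaccard similarity; both choices, as well as the names ToyFingerprint, jaccard(), and toy_recognize(), are assumptions made for illustration rather than the library's API.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Assumption: a toy fingerprint is just a set of (IP, port) destinations.
Destination = Tuple[str, int]
ToyFingerprint = FrozenSet[Destination]


def jaccard(a: ToyFingerprint, b: ToyFingerprint) -> float:
    """Jaccard similarity |a & b| / |a | b|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def toy_recognize(
    unknown: List[ToyFingerprint],
    known: Dict[ToyFingerprint, str],
) -> List[Optional[str]]:
    """Return the label of the most similar known fingerprint for each input."""
    results: List[Optional[str]] = []
    for fingerprint in unknown:
        # Pick the stored fingerprint with the highest similarity, if any exist.
        best = max(known, key=lambda k: jaccard(fingerprint, k), default=None)
        results.append(known[best] if best is not None else None)
    return results


# Fingerprints "stored" during training, mapped to hypothetical app labels.
known = {
    frozenset({("1.1.1.1", 443), ("2.2.2.2", 443)}): "app_a",
    frozenset({("3.3.3.3", 80)}): "app_b",
}
unknown = [
    frozenset({("1.1.1.1", 443)}),
    frozenset({("3.3.3.3", 80), ("4.4.4.4", 80)}),
]
recognized = toy_recognize(unknown, known)
print(recognized)  # ['app_a', 'app_b']
```

Note that this sketch always returns the best match, however weak; it never rejects an input as unknown.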

Unseen app detection

To detect unseen apps, use FlowPrint’s detect(X, threshold=0.1) method. This method creates new Fingerprint objects for the given Flow objects X and compares them to the fingerprints previously stored by the fit() method. It returns +1 for each Flow in X that matches a known fingerprint and -1 for each Flow that does not match any known fingerprint.
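The +1/-1 output can be illustrated with a minimal, self-contained sketch that does not call FlowPrint. It assumes that a fingerprint can be modelled as a set of (IP, port) destinations, that similarity is measured with Jaccard similarity, and that an input counts as matching when its best similarity is at least the threshold. These assumptions and the name toy_detect() are introduced for illustration only.

```python
from typing import FrozenSet, List, Tuple

# Assumption: a toy fingerprint is just a set of (IP, port) destinations.
ToyFingerprint = FrozenSet[Tuple[str, int]]


def jaccard(a: ToyFingerprint, b: ToyFingerprint) -> float:
    """Jaccard similarity |a & b| / |a | b|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def toy_detect(
    unknown: List[ToyFingerprint],
    known: List[ToyFingerprint],
    threshold: float = 0.1,
) -> List[int]:
    """Return +1 if an input is similar enough to any known fingerprint, else -1."""
    labels: List[int] = []
    for fingerprint in unknown:
        # Best similarity against all stored fingerprints (0.0 if none stored).
        best = max((jaccard(fingerprint, k) for k in known), default=0.0)
        labels.append(1 if best >= threshold else -1)
    return labels


known = [
    frozenset({("1.1.1.1", 443), ("2.2.2.2", 443)}),
    frozenset({("3.3.3.3", 80)}),
]
unknown = [
    frozenset({("1.1.1.1", 443)}),   # overlaps with a known fingerprint
    frozenset({("9.9.9.9", 8080)}),  # shares nothing with known fingerprints
]
print(toy_detect(unknown, known))                 # [1, -1]
print(toy_detect(unknown, known, threshold=0.6))  # [-1, -1]
```

Under these assumptions, raising the threshold makes detection stricter: more inputs are reported as unseen (-1), because a partial overlap with a known fingerprint no longer suffices.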