Cluster
After feature extraction, FlowPrint clusters all Flows into NetworkDestinations: flows are grouped together when they share the same (destination IP, destination port) tuple or the same TLS certificate.
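The grouping rule above can be illustrated with a small, self-contained sketch. It is not FlowPrint's implementation: the `Flow` namedtuple and the `group_flows` function are hypothetical names introduced here, and merging two groups when one flow links them (same destination as one group, same certificate as another) is an assumption of this sketch.

```python
from collections import namedtuple

# Hypothetical stand-in for a FlowPrint Flow: only the fields used for grouping.
Flow = namedtuple("Flow", ["dst_ip", "dst_port", "certificate"])


def group_flows(flows):
    """Return one integer group label per flow (illustrative only)."""
    parent = []          # union-find forest over group ids
    by_destination = {}  # (dst IP, dst port) -> group id
    by_certificate = {}  # TLS certificate    -> group id

    def find(i):
        # Follow parent links to the representative, halving paths on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    assigned = []
    for flow in flows:
        destination = (flow.dst_ip, flow.dst_port)
        matches = []
        if destination in by_destination:
            matches.append(find(by_destination[destination]))
        if flow.certificate is not None and flow.certificate in by_certificate:
            matches.append(find(by_certificate[flow.certificate]))

        if matches:
            root = matches[0]
            for other in matches[1:]:
                parent[other] = root  # this flow links two groups: merge them
        else:
            root = len(parent)        # no match: start a new group
            parent.append(root)

        by_destination[destination] = root
        if flow.certificate is not None:
            by_certificate[flow.certificate] = root
        assigned.append(root)

    # Resolve every label to its final representative after all merges.
    return [find(i) for i in assigned]


flows = [
    Flow("10.0.0.1", 443, "certA"),
    Flow("10.0.0.2", 443, "certA"),  # same certificate as the first flow
    Flow("10.0.0.3", 80, None),      # no TLS certificate
    Flow("10.0.0.3", 80, None),      # same destination as the previous flow
    Flow("10.0.0.4", 443, "certB"),
    Flow("10.0.0.4", 443, "certA"),  # links the certA and 10.0.0.4:443 groups
]
labels = group_flows(flows)
```

In this example the two unencrypted flows to 10.0.0.3:80 form one group, and all remaining flows end up in a second group because the last flow shares a destination with one group and a certificate with the other.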
class cluster.Cluster(load=None)
Cluster object for clustering flows by network destination.
Attributes: - samples (np.array of shape=(n_samples,)) – Samples used to fit the Cluster.
- counter (int) – Counter for the total number of NetworkDestinations generated.
- dict_destination (dict) – Dictionary of (dst IP, dst port) -> NetworkDestination.
- dict_certificate (dict) – Dictionary of TLS certificate -> NetworkDestination.
Cluster.__init__(load=None)
Cluster flows by network destinations.
Parameters: load (string, default=None) – If given, load the cluster from the JSON file at the path ‘load’.
Generating clusters
We can create clusters from Flows by fitting the Cluster object with the cluster.Cluster.fit() method.
After fitting, the cluster.Cluster.predict() method returns the cluster label of each sample as a number.
The cluster.Cluster.fit_predict() method combines both steps into a single call.
Cluster.fit(X, y=None)
Fit the clustering algorithm with flow samples X.
Parameters: - X (array-like of shape=(n_samples, n_features)) – Flow samples to fit cluster object.
- y (array-like of shape=(n_samples,), optional) – If given, add labels to each cluster.
Returns: result – Returns self
Return type: self
Cluster.predict(X)
Predict cluster labels of X.
Parameters: X (array-like of shape=(n_samples, n_features)) – Samples for which to predict NetworkDestination cluster.
Returns: result – Labels of NetworkDestination cluster corresponding to cluster of fitted samples. Has a value of -1 if no cluster could be matched.
Return type: array-like of shape=(n_samples,)
Cluster.fit_predict(X)
Fit and predict cluster with given samples.
Parameters: X (array-like of shape=(n_samples, n_features)) – Samples to fit cluster object.
Returns: result – Labels of cluster corresponding to cluster of fitted samples. Has a value of -1 if no cluster could be matched.
Return type: array-like of shape=(n_samples,)
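To make the fit / predict / fit_predict contract concrete, here is a minimal sketch of a class with the same method shape. `ToyCluster` is a hypothetical stand-in written for this example, not the FlowPrint class: it keys only on (destination IP, destination port) tuples, ignores TLS certificates, and returns plain lists rather than arrays.

```python
class ToyCluster:
    """Hypothetical stand-in mirroring the fit/predict/fit_predict interface."""

    def __init__(self):
        # (dst IP, dst port) -> integer label, filled in by fit()
        self.labels_by_destination = {}

    def fit(self, X, y=None):
        # Assign a fresh label to every destination not seen before.
        for destination in X:
            key = tuple(destination)
            if key not in self.labels_by_destination:
                self.labels_by_destination[key] = len(self.labels_by_destination)
        return self  # returning self allows fit(...).predict(...) chaining

    def predict(self, X):
        # Destinations that were never fitted get the label -1.
        return [self.labels_by_destination.get(tuple(d), -1) for d in X]

    def fit_predict(self, X):
        return self.fit(X).predict(X)


train = [("10.0.0.1", 443), ("10.0.0.2", 80), ("10.0.0.1", 443)]
unseen = [("10.0.0.2", 80), ("192.168.1.1", 22)]

toy = ToyCluster()
train_labels = toy.fit_predict(train)  # labels for the fitted samples
unseen_labels = toy.predict(unseen)    # second entry was never fitted
```

The point of the sketch is the -1 convention: a sample whose destination was not present during fitting cannot be matched to any cluster, so callers should check for -1 before using a label as an index or key.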
Cluster views
We can extract the NetworkDestinations generated by the cluster either as a set or as a dictionary of identifier -> NetworkDestination.
I/O methods
A cluster can be saved and loaded for further analysis. Additionally, you can get a copy of the current Cluster.
Cluster.save(outfile)
Saves cluster object to JSON file.
Parameters: outfile (string) – Path to JSON file in which to store the cluster object.
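The layout of the JSON file that FlowPrint writes is not described in this section. Purely as an illustration of what a save/load round trip for a destination dictionary involves, the sketch below uses two hypothetical helpers, `save_clusters` and `load_clusters`, with an invented record layout. The one general point it shows is that JSON object keys must be strings, so (IP, port) tuples have to be stored in another form.

```python
import json
import os
import tempfile


def save_clusters(destinations, outfile):
    """Write a {(ip, port): label} dict to a JSON file (hypothetical layout)."""
    # Tuples cannot be JSON object keys, so store one record per destination.
    records = [
        {"ip": ip, "port": port, "label": label}
        for (ip, port), label in destinations.items()
    ]
    with open(outfile, "w") as handle:
        json.dump(records, handle)


def load_clusters(infile):
    """Read the records back and rebuild the {(ip, port): label} dict."""
    with open(infile) as handle:
        records = json.load(handle)
    return {(r["ip"], r["port"]): r["label"] for r in records}


destinations = {("10.0.0.1", 443): 0, ("10.0.0.2", 80): 1}

# Round trip through a temporary file so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cluster.json")
    save_clusters(destinations, path)
    restored = load_clusters(path)
```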
Visualisation
To get a visual representation of the generated clusters, we offer the cluster.Cluster.plot() method.