TensorClus reader

The TensorClus.reader module provides functions to load and read different data formats.

TensorClus.reader.load.load_dataset(datasetName)

Load one of the available datasets.

datasetName : str: the name of the dataset

tensor: three-way numpy array
labels: true row classes (ground-truth)
slices: slice names

TensorClus.reader.load.read_txt_tensor(filePath)

Read tensor data from a text file.

filePath : str: the path of the file

tensor: three-way numpy array

TensorClus.reader.load.save_txt_tensor(tensor, filePath)

Save tensor data as a text file.

tensor : tensor array

filePath : str: the path of the file
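
A round trip through a text file can be sketched as follows. The on-disk layout used by save_txt_tensor and read_txt_tensor is not described here, so the layout below (a shape header followed by one flattened row object per line) and the helper names save_tensor_sketch and read_tensor_sketch are assumptions made for illustration only.

```python
import os
import tempfile

import numpy as np


def save_tensor_sketch(tensor, file_path):
    """Hypothetical helper: write a three-way array as text with a shape header."""
    tensor = np.asarray(tensor)
    header = " ".join(str(size) for size in tensor.shape)
    # Flatten the last two modes so that each line holds one row object.
    np.savetxt(file_path, tensor.reshape(tensor.shape[0], -1), header=header)


def read_tensor_sketch(file_path):
    """Hypothetical helper: read back a file written by save_tensor_sketch."""
    with open(file_path) as handle:
        # np.savetxt prefixes the header line with "# ".
        shape = tuple(int(s) for s in handle.readline().lstrip("# ").split())
    return np.loadtxt(file_path).reshape(shape)


tensor = np.arange(24, dtype=float).reshape(2, 3, 4)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tensor.txt")
    save_tensor_sketch(tensor, path)
    restored = read_tensor_sketch(path)
```

Storing the shape in the header is one simple way to recover the three modes from a two-dimensional text layout; files written by the library itself may be organised differently.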

TensorClus decomposition

The TensorClus.decomposition.decomposition_with_clustering module provides a class with common methods for running several clustering algorithms on decomposition results.

class TensorClus.decomposition.decomposition_with_clustering.DecompositionWithClustering(n_clusters=[2, 2, 2], modes=[1, 2, 3], algorithm='Kmeans++')

Clustering from decomposition results.

n_clusters : array-like, optional, default: [2,2,2]: Number of row clusters to form
modes : array-like, optional, default: [1,2,3]: Selected modes for clustering
algorithm : string, optional, default: 'Kmeans++': Selected algorithm for clustering

labels_ : array-like, shape (n_rows,): clustering label of each row

fit(X, y=None)

Perform Tensor co-clustering.

X : decomposition results
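
The idea of clustering from decomposition results can be sketched without the library: take the factor matrix of one mode (one row per object) and group its rows with k-means. The factor matrix below is synthetic, kmeans_rows is a hypothetical helper, and a deterministic farthest-point initialization stands in for k-means++. None of this is taken from the DecompositionWithClustering implementation.

```python
import numpy as np


def kmeans_rows(factors, n_clusters, n_iter=20):
    """Hypothetical helper: plain Lloyd iterations on the rows of a factor matrix."""
    # Farthest-point initialization: start from row 0, then repeatedly add
    # the row that is farthest from the centers chosen so far.
    centers = [factors[0]]
    for _ in range(1, n_clusters):
        dists = np.min(
            [np.linalg.norm(factors - c, axis=1) for c in centers], axis=0
        )
        centers.append(factors[np.argmax(dists)])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        # Assign each row to its nearest center.
        dists = np.linalg.norm(factors[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Recompute each center as the mean of the rows assigned to it.
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = factors[labels == k].mean(axis=0)
    return labels


rng = np.random.default_rng(0)
# Synthetic mode-1 factor matrix: two well-separated groups of 5 rows, rank 3.
factors = np.vstack([
    rng.normal(0.0, 0.1, size=(5, 3)),
    rng.normal(5.0, 0.1, size=(5, 3)),
])
labels = kmeans_rows(factors, n_clusters=2)
```

Repeating the same step on the factor matrix of each selected mode would give one label vector per mode.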

TensorClus coclustering

The TensorClus.coclustering.sparseTensorCoclustering module provides an implementation of a sparse tensor co-clustering algorithm.

class TensorClus.coclustering.sparseTensorCoclustering.SparseTensorCoclusteringPoisson(n_clusters=2, fuzzy=True, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)

Tensor Latent Block Model for Poisson distribution.

n_row_clusters : int, optional, default: 2: Number of row clusters to form
n_col_clusters : int, optional, default: 2: Number of column clusters to form
fuzzy : boolean, optional, default: True: Provide fuzzy clustering. If fuzzy is False, a hard clustering is performed
init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None: Initial row labels
init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None: Initial column labels
max_iter : int, optional, default: 50: Maximum number of iterations
n_init : int, optional, default: 1: Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.
random_state : integer or numpy.RandomState, optional: The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
tol : float, default: 1e-6: Relative tolerance with respect to the criterion used to declare convergence

row_labels_ : array-like, shape (n_rows,): Bicluster label of each row
column_labels_ : array-like, shape (n_cols,): Bicluster label of each column
gamma_kl : array-like, shape (k,l,v): Value \(\frac{p_{kl}}{p_{k.} \times p_{.l}}\) for each row cluster k and column cluster l
gamma_kl_evolution : array-like, shape (k,l,max_iter): Value of gamma_kl of each bicluster according to iterations

F_c(x, z, w, gammakl, pi_k, rho_l, choice='ZW')

Compute fuzzy log-likelihood (LL) criterion.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed
z : numpy array, shape=(n_row_objects, K): matrix of row partition
w : numpy array, shape=(d_col_objects, L): matrix of column partition
gammakl : three-way numpy array, shape=(K, L, v_features): matrix of block parameters
pi_k : numpy array, shape=(K,): vector of row cluster proportions
rho_l : numpy array, shape=(L,): vector of column cluster proportions
choice : string, one of 'Z', 'W', 'ZW': which partition(s) the LL optimization considers

(H_z, H_w, LL, value): (row entropy, column entropy, log-likelihood, lower bound of log-likelihood)
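
The row and column entropy terms returned with the criterion can be illustrated on their own. The sketch below assumes the usual Shannon form, H(z) = -sum_{i,k} z_ik log z_ik, for a partition matrix whose rows sum to one. Whether F_c uses exactly this expression is not stated here, and partition_entropy is a hypothetical helper.

```python
import numpy as np


def partition_entropy(partition, eps=1e-12):
    """Hypothetical helper: Shannon entropy of a (fuzzy) partition matrix."""
    partition = np.asarray(partition, dtype=float)
    # Clip before taking the log so that zero memberships contribute 0, not NaN.
    return float(-np.sum(partition * np.log(np.clip(partition, eps, None))))


hard = np.array([[1.0, 0.0], [0.0, 1.0]])    # each object in exactly one cluster
fuzzy = np.array([[0.5, 0.5], [0.5, 0.5]])   # maximally uncertain memberships

h_hard = partition_entropy(hard)
h_fuzzy = partition_entropy(fuzzy)
```

Under this assumption a hard partition has zero entropy, and the entropy grows as memberships become more evenly spread across clusters.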

fit(X, y=None)

Perform Tensor co-clustering.

X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed

gammakl(x, z, w)

Compute gamma_kl per block.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed
z : row partition
w : column partition

gamma_kl_mat : three-way numpy array, shape=(K, L, v_features): Computed parameters per block

pi_k(z)

Compute row proportion.

z : numpy array, shape=(n_row_objects, K): matrix of row partition

pi_k_vect : numpy array, shape=(K,): proportion of row clusters
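
A minimal sketch of row proportions, assuming they are the column means of the partition matrix z, that is, the share of row objects (or of total membership, in the fuzzy case) assigned to each cluster. The name row_proportions is hypothetical and is not the library's pi_k method.

```python
import numpy as np


def row_proportions(z):
    """Hypothetical helper: proportion of row objects in each of the K clusters."""
    z = np.asarray(z, dtype=float)
    # Sum the memberships per cluster, then divide by the number of row objects.
    return z.sum(axis=0) / z.shape[0]


# Four row objects: three in cluster 0, one in cluster 1.
z = np.array([[1, 0], [1, 0], [1, 0], [0, 1]])
pi = row_proportions(z)
```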

rho_l(w)

Compute column proportion.

w : numpy array, shape=(d_col_objects, L): matrix of column partition

rho_l_vect : numpy array, shape=(L,): proportion of column clusters

The TensorClus.coclustering.tensorCoclusteringPoisson module provides an implementation of a tensor co-clustering algorithm for three-way count tensors.

class TensorClus.coclustering.tensorCoclusteringPoisson.TensorCoclusteringPoisson(n_row_clusters=2, n_col_clusters=2, fuzzy=True, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)

Tensor Latent Block Model for Poisson distribution.

n_row_clusters : int, optional, default: 2: Number of row clusters to form
n_col_clusters : int, optional, default: 2: Number of column clusters to form
fuzzy : boolean, optional, default: True: Provide fuzzy clustering. If fuzzy is False, a hard clustering is performed
init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None: Initial row labels
init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None: Initial column labels
max_iter : int, optional, default: 50: Maximum number of iterations
n_init : int, optional, default: 1: Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.
random_state : integer or numpy.RandomState, optional: The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
tol : float, default: 1e-6: Relative tolerance with respect to the criterion used to declare convergence

row_labels_ : array-like, shape (n_rows,): Bicluster label of each row
column_labels_ : array-like, shape (n_cols,): Bicluster label of each column
gamma_kl : array-like, shape (k,l,v): Value \(\frac{p_{kl}}{p_{k.} \times p_{.l}}\) for each row cluster k and column cluster l
gamma_kl_evolution : array-like, shape (k,l,max_iter): Value of gamma_kl of each bicluster according to iterations

F_c(x, z, w, gammakl, pi_k, rho_l, choice='ZW')

Compute fuzzy log-likelihood (LL) criterion.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed
z : numpy array, shape=(n_row_objects, K): matrix of row partition
w : numpy array, shape=(d_col_objects, L): matrix of column partition
gammakl : three-way numpy array, shape=(K, L, v_features): matrix of block parameters
pi_k : numpy array, shape=(K,): vector of row cluster proportions
rho_l : numpy array, shape=(L,): vector of column cluster proportions
choice : string, one of 'Z', 'W', 'ZW': which partition(s) the LL optimization considers

(H_z, H_w, LL, value): (row entropy, column entropy, log-likelihood, lower bound of log-likelihood)

fit(X, y=None)

Perform Tensor co-clustering.

X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed

gammakl(x, z, w)

Compute gamma_kl per block.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed
z : numpy array, shape=(n_row_objects, K): matrix of row partition
w : numpy array, shape=(d_col_objects, L): matrix of column partition

gamma_kl_mat : three-way numpy array, shape=(K, L, v_features): Computed parameters per block
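
The ratio p_kl / (p_k. × p_.l) listed for gamma_kl can be sketched directly from block sums. The code below assumes each slice is normalised by its own total and that no cluster is empty; these choices, and the name gamma_kl_sketch, are assumptions rather than the library's gammakl implementation.

```python
import numpy as np


def gamma_kl_sketch(x, z, w):
    """Hypothetical helper: p_kl / (p_k. * p_.l) per block and per slice."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    # Block sums: add up x[i, j, v] over rows in cluster k and columns in cluster l.
    block_sums = np.einsum("ik,ijv,jl->klv", z, x, w)
    # Joint block probabilities, normalised by the total count of each slice.
    p_kl = block_sums / x.sum(axis=(0, 1))
    # Row-cluster and column-cluster margins of the block probabilities.
    p_k = p_kl.sum(axis=1, keepdims=True)
    p_l = p_kl.sum(axis=0, keepdims=True)
    return p_kl / (p_k * p_l)


# Two slices of a 4 x 4 count table with heavier diagonal blocks.
x = np.ones((4, 4, 2))
x[:2, :2] = 3
x[2:, 2:] = 3
z = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])  # row partition
w = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])  # column partition
gamma = gamma_kl_sketch(x, z, w)
```

Values above 1 mark blocks that hold more mass than independence between row and column clusters would predict; here the diagonal blocks come out at 1.5 and the off-diagonal blocks at 0.5.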

pi_k(z)

Compute row proportion.

z : numpy array, shape=(n_row_objects, K): matrix of row partition

pi_k_vect : numpy array, shape=(K,): proportion of row clusters

rho_l(w)

Compute column proportion.

w : numpy array, shape=(d_col_objects, L): matrix of column partition

rho_l_vect : numpy array, shape=(L,): proportion of column clusters
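
The partition matrices z and w used by these methods hold one row per object and one column per cluster. The sketch below builds w from a vector of hard column labels and derives column proportions as its column means. Both helper names are hypothetical, and the mean-based formula is an assumption about what rho_l computes.

```python
import numpy as np


def one_hot_partition(labels, n_clusters):
    """Hypothetical helper: turn hard labels into a 0/1 partition matrix."""
    labels = np.asarray(labels)
    partition = np.zeros((labels.size, n_clusters))
    # Put a 1 in the column of the cluster each object belongs to.
    partition[np.arange(labels.size), labels] = 1.0
    return partition


def column_proportions(w):
    """Hypothetical helper: share of column objects in each of the L clusters."""
    return np.asarray(w, dtype=float).mean(axis=0)


column_labels = [0, 2, 2, 1, 2]
w = one_hot_partition(column_labels, n_clusters=3)
rho = column_proportions(w)
```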

The TensorClus.coclustering.tensorCoclusteringGaussian module provides an implementation of a tensor co-clustering algorithm for continuous three-way tensors.

class TensorClus.coclustering.tensorCoclusteringGaussian.TensorCoclusteringGaussian(n_row_clusters=2, n_col_clusters=2, fuzzy=True, parsimonious=True, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)

Tensor Latent Block Model for Normal distribution.

n_row_clusters : int, optional, default: 2: Number of row clusters to form
n_col_clusters : int, optional, default: 2: Number of column clusters to form
fuzzy : boolean, optional, default: True: Provide fuzzy clustering. If fuzzy is False, a hard clustering is performed
parsimonious : boolean, optional, default: True: Use the parsimonious model. If parsimonious is False, sigma is computed at each iteration
init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None: Initial row labels
init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None: Initial column labels
max_iter : int, optional, default: 50: Maximum number of iterations
n_init : int, optional, default: 1: Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.
random_state : integer or numpy.RandomState, optional: The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
tol : float, default: 1e-6: Relative tolerance with respect to the criterion used to declare convergence

row_labels_ : array-like, shape (n_rows,): Bicluster label of each row
column_labels_ : array-like, shape (n_cols,): Bicluster label of each column
mu_kl : array-like, shape (k,l,v): Mean vector for each row cluster k and column cluster l
sigma_kl_ : array-like, shape (k,l,v,v): Covariance matrix for each row cluster k and column cluster l

F_c(x, z, w, mukl, sigma_x_kl, pi_k, rho_l, choice='ZW')

Compute fuzzy log-likelihood (LL) criterion.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed
z : numpy array, shape=(n_row_objects, K): matrix of row partition
w : numpy array, shape=(d_col_objects, L): matrix of column partition
mukl : three-way numpy array, shape=(K, L, v_features): matrix of mean parameters per block
sigma_x_kl : four-way numpy array, shape=(K, L, v_features, v_features): tensor of sigma matrices for all blocks
pi_k : numpy array, shape=(K,): vector of row cluster proportions
rho_l : numpy array, shape=(L,): vector of column cluster proportions
choice : string, one of 'Z', 'W', 'ZW': which partition(s) the LL optimization considers

(H_z, H_w, LL, value): (row entropy, column entropy, log-likelihood, lower bound of log-likelihood)

fit(X, y=None)

Perform Tensor co-clustering.

X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed

mukl(x, z, w)

Compute the mean vector mu_kl per block.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed
z : numpy array, shape=(n_row_objects, K): matrix of row partition
w : numpy array, shape=(d_col_objects, L): matrix of column partition

mukl_mat : three-way numpy array, shape=(K, L, v_features): Computed parameters per block
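
A per-block mean vector can be sketched as a membership-weighted average of the feature vectors x[i, j, :]. The formula below (block sum divided by the number of cells in the block) and the name block_means are assumptions for illustration, not the library's mukl code.

```python
import numpy as np


def block_means(x, z, w):
    """Hypothetical helper: mean feature vector of each (k, l) block."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    # Sum of feature vectors over rows in cluster k and columns in cluster l.
    sums = np.einsum("ik,ijv,jl->klv", z, x, w)
    # Number of cells (or total membership weight) in each block.
    counts = np.outer(z.sum(axis=0), w.sum(axis=0))
    return sums / counts[:, :, None]


# 4 x 4 cells with 2 features; two diagonal blocks carry distinct mean vectors.
x = np.zeros((4, 4, 2))
x[:2, :2] = [1.0, 10.0]
x[2:, 2:] = [2.0, 20.0]
z = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
w = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
mu = block_means(x, z, w)
```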

pi_k(z)

Compute row proportion.

z : numpy array, shape=(n_row_objects, K): matrix of row partition

pi_k_vect : numpy array, shape=(K,): proportion of row clusters

rho_l(w)

Compute column proportion.

w : numpy array, shape=(d_col_objects, L): matrix of column partition

rho_l_vect : numpy array, shape=(L,): proportion of column clusters

sigma_x_kl(x, z, w, mukl)

Compute the covariance matrix sigma_kl per block.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed
z : numpy array, shape=(n_row_objects, K): matrix of row partition
w : numpy array, shape=(d_col_objects, L): matrix of column partition
mukl : numpy array, shape=(K, L, v_features): tensor of mukl values

sigma_x_kl_mat : three-way numpy array: Computed covariance parameters per block
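
A per-block covariance can be sketched as the membership-weighted average of the outer products of centred feature vectors. The output below follows the (k, l, v, v) shape listed for the sigma_kl_ attribute. The normalisation by the block size and the name block_covariances are assumptions, and the parsimonious variant is not covered.

```python
import numpy as np


def block_covariances(x, z, w, mukl):
    """Hypothetical helper: covariance matrix of the feature vectors in each block."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    n_k, n_l, n_v = mukl.shape
    sigma = np.zeros((n_k, n_l, n_v, n_v))
    for k in range(n_k):
        for l in range(n_l):
            # Weight of cell (i, j) in block (k, l).
            weights = np.outer(z[:, k], w[:, l])
            # Feature vectors centred on the block mean.
            centred = x - mukl[k, l]
            # Weighted sum of outer products, divided by the total weight.
            sigma[k, l] = np.einsum(
                "ij,iju,ijv->uv", weights, centred, centred
            ) / weights.sum()
    return sigma


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5, 2))
z = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
w = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
# Block means computed under the same averaging assumption.
mukl = np.einsum("ik,ijv,jl->klv", z, x, w) / np.outer(
    z.sum(axis=0), w.sum(axis=0)
)[:, :, None]
sigma = block_covariances(x, z, w, mukl)
```

For hard partitions this coincides with the biased sample covariance of the cells in each block, which gives a convenient check against np.cov.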

The TensorClus.coclustering.tensorCoclusteringBernoulli module provides an implementation of a tensor co-clustering algorithm for binary three-way tensors.

class TensorClus.coclustering.tensorCoclusteringBernoulli.TensorCoclusteringBernoulli(n_row_clusters=2, n_col_clusters=2, fuzzy=False, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)

Tensor Latent Block Model for Bernoulli distribution.

n_row_clusters : int, optional, default: 2: Number of row clusters to form
n_col_clusters : int, optional, default: 2: Number of column clusters to form
fuzzy : boolean, optional, default: False: Provide fuzzy clustering. If fuzzy is False, a hard clustering is performed
init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None: Initial row labels
init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None: Initial column labels
max_iter : int, optional, default: 50: Maximum number of iterations
n_init : int, optional, default: 1: Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.
random_state : integer or numpy.RandomState, optional: The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
tol : float, default: 1e-6: Relative tolerance with respect to the criterion used to declare convergence

row_labels_ : array-like, shape (n_rows,): Bicluster label of each row
column_labels_ : array-like, shape (n_cols,): Bicluster label of each column
mu_kl : array-like, shape (k,l,v): Mean vector for each row cluster k and column cluster l

F_c(x, z, w, mukl, pi_k, rho_l, choice='ZW')

Compute fuzzy log-likelihood (LL) criterion.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed
z : numpy array, shape=(n_row_objects, K): matrix of row partition
w : numpy array, shape=(d_col_objects, L): matrix of column partition
mukl : three-way numpy array, shape=(K, L, v_features): matrix of mean parameters per block
pi_k : numpy array, shape=(K,): vector of row cluster proportions
rho_l : numpy array, shape=(L,): vector of column cluster proportions
choice : string, one of 'Z', 'W', 'ZW': which partition(s) the LL optimization considers

(H_z, H_w, LL, value): (row entropy, column entropy, log-likelihood, lower bound of log-likelihood)
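
For binary data, the data-dependent part of a log-likelihood can be sketched as a sum of Bernoulli log-probabilities, one per cell, using the mean of the block each cell falls in. The sketch assumes hard (0/1) partitions and leaves out the proportion and entropy terms, so it is not the full F_c criterion. The name bernoulli_block_loglik is hypothetical.

```python
import numpy as np


def bernoulli_block_loglik(x, z, w, mukl, eps=1e-10):
    """Hypothetical helper: Bernoulli log-likelihood of x given hard partitions."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    # Probability of a 1 in cell (i, j, v): the mean of the block containing it.
    probs = np.einsum("ik,klv,jl->ijv", z, mukl, w)
    # Keep probabilities away from 0 and 1 so the logs stay finite.
    probs = np.clip(probs, eps, 1.0 - eps)
    return float(np.sum(x * np.log(probs) + (1.0 - x) * np.log(1.0 - probs)))


# 2 x 2 cells, one feature; ones on the diagonal blocks, zeros elsewhere.
x = np.array([[[1.0], [0.0]], [[0.0], [1.0]]])
z = np.eye(2)
w = np.eye(2)
mukl = np.array([[[0.9], [0.1]], [[0.1], [0.9]]])
ll = bernoulli_block_loglik(x, z, w, mukl)
```

Each of the four cells agrees with its block mean with probability 0.9, so the total equals 4 · log(0.9).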

fit(X, y=None)

Perform Tensor co-clustering.

X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed

mukl(x, z, w)

Compute the mean vector mu_kl per block.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features): Tensor to be analyzed
z : numpy array, shape=(n_row_objects, K): matrix of row partition
w : numpy array, shape=(d_col_objects, L): matrix of column partition

mukl_mat : three-way numpy array

pi_k(z)

Compute row proportion.

z : numpy array, shape=(n_row_objects, K): matrix of row partition

pi_k_vect : numpy array, shape=(K,): proportion of row clusters

rho_l(w)

Compute column proportion.

w : numpy array, shape=(d_col_objects, L): matrix of column partition

rho_l_vect : numpy array, shape=(L,): proportion of column clusters

TensorClus vizualisation

The TensorClus.vizualisation module provides functions to visualize different measures or data.

TensorClus.vizualisation.__init__.Plot_CoClust_axes_etiquette(title, fig, axes, data, phiR, phiC, K, L, etiquette)

Plot CoClustering results for each slice on specific axes.

title: title of figure

fig : figure that includes all axes

axes : list of axes corresponding to the number of slices

data : tensor data

phiR : row clustering partition

phiC : column clustering partition

K : number of row clusters

L : number of column clusters

etiquette : name of slices

TensorClus.vizualisation.__init__.duplicates(lst, item)

Find the indices of duplicated values.

lst: list of values
item: value to look for

list: indices of duplicated values
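
The lookup described above, returning every position at which a given value occurs in a list, can be sketched in a single comprehension. The name find_indices is hypothetical, and the library function may differ in details such as its return type.

```python
def find_indices(lst, item):
    """Hypothetical helper: positions in lst whose value equals item."""
    return [index for index, value in enumerate(lst) if value == item]


positions = find_indices(["a", "b", "a", "c", "a"], "a")
missing = find_indices([1, 2], 3)
```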

TensorClus.vizualisation.__init__.generateColour()

Generate random color.

str: hex color
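
A random hex color string can be produced by drawing a 24-bit integer and formatting it with six hexadecimal digits. This is a sketch of the general technique with a hypothetical name, random_hex_colour, and the exact format returned by generateColour (letter case, for instance) is an assumption.

```python
import random


def random_hex_colour(rng=random):
    """Hypothetical helper: random colour as '#rrggbb'."""
    # 0x1000000 == 256 ** 3, so every red/green/blue combination is reachable.
    return "#{:06x}".format(rng.randrange(0x1000000))


colour = random_hex_colour()
# A seeded generator gives a reproducible palette across runs.
palette = [random_hex_colour(random.Random(seed)) for seed in range(3)]
```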

TensorClus.vizualisation.__init__.plot_logLikelihood_evolution(model, do_plot=True, save=False, dpi=200)

Plot the intermediate log-likelihood values of a model at each iteration.

model: TensorClus.coclustering, Fitted model

do_plot: boolean, Whether the plot should be displayed. True by default. Disabling this allows users to handle displaying the plot themselves.

save : boolean, False by default. Whether to save the plot as an image

dpi : int, 200 by default. Resolution to use when saving the image

TensorClus.vizualisation.__init__.plot_parameter_evolution(model, do_plot=True, save=False, dpi=200)

Plot all intermediate gammaKK parameters for a model at each iteration.

model: TensorClus.coclustering, Fitted model

do_plot: boolean, Whether the plot should be displayed. True by default. Disabling this allows users to handle displaying the plot themselves.

save : boolean, False by default. Whether to save the plot as an image

dpi : int, 200 by default. Resolution to use when saving the image

TensorClus.vizualisation.__init__.plot_slice_reorganisation(data, model, slicesName=None, do_plot=True, save=False, dpi=200)

Plot all intermediate modularities for a model.

data : tensor data

model: TensorClus.coclustering.CoclustMod, Fitted model

slicesName : list of slice names

do_plot: boolean, Whether the plot should be displayed. True by default. Disabling this allows users to handle displaying the plot themselves.

save : boolean, False by default. Whether to save the plot as an image

dpi : int, 200 by default. Resolution to use when saving the image
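
Reorganising a slice for display usually means permuting its rows and columns so that members of the same cluster sit next to each other. The sketch below shows that permutation with NumPy only. The name reorganise_slice is hypothetical, no plotting is done, and the library's own routine may order clusters differently.

```python
import numpy as np


def reorganise_slice(slice_2d, row_labels, column_labels):
    """Hypothetical helper: reorder one slice so clusters form contiguous blocks."""
    # A stable sort keeps the original order of objects inside each cluster.
    row_order = np.argsort(row_labels, kind="stable")
    col_order = np.argsort(column_labels, kind="stable")
    return slice_2d[np.ix_(row_order, col_order)]


slice_2d = np.array([
    [1, 0, 1, 0],
    [0, 2, 0, 2],
    [1, 0, 1, 0],
    [0, 2, 0, 2],
])
row_labels = [0, 1, 0, 1]
column_labels = [0, 1, 0, 1]
reordered = reorganise_slice(slice_2d, row_labels, column_labels)
```

After the permutation the interleaved pattern becomes two visible diagonal blocks, which is the structure a reorganised-slice plot is meant to expose.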