TensorClus reader

The TensorClus.reader module provides functions to load and read different data formats.

TensorClus.reader.load.load_dataset(datasetName)

Load one of the available datasets.

datasetName : str

the name of the dataset

tensor

three-way numpy array

labels

true row classes (ground-truth)

slices

slice names

TensorClus.reader.load.read_txt_tensor(filePath)

Read tensor data from a text file.

filePath : str

the path of the file

tensor

three-way numpy array

TensorClus.reader.load.save_txt_tensor(tensor, filePath)

Save tensor data as a text file.

tensor : tensor array

filePath : str

the path of the file
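The on-disk layout used by read_txt_tensor and save_txt_tensor is not described here. As a rough illustration only, the sketch below round-trips a three-way array through a text file with plain NumPy, assuming a layout in which the tensor is flattened to two dimensions and its shape is kept in a header comment. The helper names save_tensor_as_text and read_tensor_from_text are hypothetical and are not part of TensorClus.

```python
import os
import tempfile

import numpy as np


def save_tensor_as_text(tensor, file_path):
    """Hypothetical helper: store a three-way array as a 2-D text table.

    Assumed layout (not the TensorClus format): the shape goes into the
    header comment and the last two modes are flattened into columns.
    """
    tensor = np.asarray(tensor)
    header = " ".join(str(size) for size in tensor.shape)
    np.savetxt(file_path, tensor.reshape(tensor.shape[0], -1), header=header)


def read_tensor_from_text(file_path):
    """Hypothetical helper: rebuild the array written by save_tensor_as_text."""
    with open(file_path) as handle:
        # np.savetxt prefixes the header with "# "
        shape = tuple(int(size) for size in handle.readline().lstrip("# ").split())
    return np.loadtxt(file_path).reshape(shape)


tensor = np.arange(24, dtype=float).reshape(2, 3, 4)
path = os.path.join(tempfile.mkdtemp(), "tensor.txt")
save_tensor_as_text(tensor, path)
restored = read_tensor_from_text(path)
```

Keeping the shape in the header means the reader does not need to be told the tensor dimensions separately.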

TensorClus decomposition

The TensorClus.decomposition.decomposition_with_clustering module provides a class with common methods for applying several clustering algorithms to decomposition results.

class TensorClus.decomposition.decomposition_with_clustering.DecompositionWithClustering(n_clusters=[2, 2, 2], modes=[1, 2, 3], algorithm='Kmeans++')

Clustering from decomposition results.

n_clusters : array-like, optional, default: [2, 2, 2]

Number of row clusters to form

modes : array-like, optional, default: [1, 2, 3]

Selected modes for clustering

algorithm : string, optional, default: 'Kmeans++'

Selected algorithm for clustering

labels_ : array-like, shape (n_rows,)

clustering label of each row

fit(X, y=None)

Perform Tensor co-clustering.

X : decomposition results

TensorClus coclustering

The TensorClus.coclustering.sparseTensorCoclustering module provides an implementation of a sparse tensor co-clustering algorithm.

class TensorClus.coclustering.sparseTensorCoclustering.SparseTensorCoclusteringPoisson(n_clusters=2, fuzzy=True, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)

Tensor Latent Block Model for Poisson distribution.

n_row_clusters : int, optional, default: 2

Number of row clusters to form

n_col_clusters : int, optional, default: 2

Number of column clusters to form

fuzzy : boolean, optional, default: True

Perform fuzzy clustering. If fuzzy is False, a hard clustering is performed

init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None

Initial row labels

init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None

Initial column labels

max_iter : int, optional, default: 50

Maximum number of iterations

n_init : int, optional, default: 1

Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.

random_state : integer or numpy.random.RandomState, optional

The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.

tol : float, default: 1e-6

Relative tolerance with regard to the criterion to declare convergence

row_labels_ : array-like, shape (n_rows,)

Bicluster label of each row

column_labels_ : array-like, shape (n_cols,)

Bicluster label of each column

gamma_kl : array-like, shape (k, l, v)

Value \(\frac{p_{kl}}{p_{k.} \times p_{.l}}\) for each row cluster k and column cluster l

gamma_kl_evolution : array-like, shape (k, l, max_iter)

Value of gamma_kl of each bicluster according to iterations
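The relationship between a fuzzy partition and the hard labels reported in row_labels_ can be sketched as follows. This is an illustration with NumPy only, assuming that hard labels are taken as the cluster with the largest membership; the matrix z_fuzzy is made-up data and the code is not the library's implementation.

```python
import numpy as np

# Hypothetical fuzzy row partition: 4 rows, K = 2 clusters, each row sums to 1.
z_fuzzy = np.array([[0.9, 0.1],
                    [0.2, 0.8],
                    [0.6, 0.4],
                    [0.1, 0.9]])

# Hard label of each row: index of the cluster with the largest membership.
row_labels = z_fuzzy.argmax(axis=1)

# Equivalent hard (one-hot) partition matrix of shape (n_rows, K).
z_hard = np.eye(z_fuzzy.shape[1])[row_labels]

print(row_labels.tolist())  # [0, 1, 0, 1]
```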

F_c(x, z, w, gammakl, pi_k, rho_l, choice='ZW')

Compute fuzzy log-likelihood (LL) criterion.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

gammakl : three-way numpy array, shape=(K, L, v_features)

tensor of block parameters

pi_k : numpy array, shape=(K,)

vector of row cluster proportions

rho_l : numpy array, shape=(L,)

vector of column cluster proportions

choice : string, takes values in ("Z", "W", "ZW")

which partition(s) to consider in the optimization of LL

(H_z, H_w, LL, value)

(row entropy, column entropy, Log-likelihood, lower bound of log-likelihood)
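The entropy terms in the returned tuple can be illustrated with a short NumPy sketch. It assumes that the row and column entropies are the Shannon entropies of the fuzzy partition matrices z and w, which is not stated explicitly here; partition_entropy is a hypothetical helper name.

```python
import numpy as np


def partition_entropy(partition, eps=1e-12):
    """Shannon entropy -sum(p * log(p)) of a partition matrix.

    Entries are clipped to [eps, 1] so that log(0) never occurs.
    """
    p = np.clip(np.asarray(partition, dtype=float), eps, 1.0)
    return float(-np.sum(p * np.log(p)))


z_hard = np.array([[1.0, 0.0], [0.0, 1.0]])    # every row fully assigned
z_fuzzy = np.array([[0.5, 0.5], [0.5, 0.5]])   # every row maximally uncertain

h_hard = partition_entropy(z_hard)     # numerically close to 0
h_fuzzy = partition_entropy(z_fuzzy)   # 2 * log(2)
```

Under this assumption a hard partition contributes almost no entropy, so the criterion reduces to the log-likelihood term.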

fit(X, y=None)

Perform Tensor co-clustering.

X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

gammakl(x, z, w)

Compute gamma_kl per block.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

z : row partition

w : column partition

gamma_kl_mat : three-way numpy array, shape=(K, L, v_features)

Computed parameters per block

pi_k(z)

Compute row proportion.

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

pi_k_vect : numpy array, shape=(K,)

proportion of row clusters

rho_l(w)

Compute column proportion.

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

rho_l_vect : numpy array, shape=(L,)

proportion of column clusters
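A minimal NumPy sketch of cluster proportions follows. It assumes that the proportion of a cluster is the column mean of the partition matrix, which is the usual definition in latent block models but is not confirmed by the text above; cluster_proportions is a hypothetical helper, not the library method.

```python
import numpy as np


def cluster_proportions(partition):
    """Column means of a partition matrix: share of objects in each cluster."""
    partition = np.asarray(partition, dtype=float)
    return partition.sum(axis=0) / partition.shape[0]


# Hard row partition of 4 objects into K = 3 clusters (made-up data).
z = np.array([[1, 0, 0],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])

pi = cluster_proportions(z)
print(pi.tolist())  # [0.5, 0.25, 0.25]
```

The same function applies to a column partition w, giving a vector of length L.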

The TensorClus.coclustering.tensorCoclusteringPoisson module provides an implementation of a tensor co-clustering algorithm for three-way count tensors.

class TensorClus.coclustering.tensorCoclusteringPoisson.TensorCoclusteringPoisson(n_row_clusters=2, n_col_clusters=2, fuzzy=True, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)

Tensor Latent Block Model for Poisson distribution.

n_row_clusters : int, optional, default: 2

Number of row clusters to form

n_col_clusters : int, optional, default: 2

Number of column clusters to form

fuzzy : boolean, optional, default: True

Perform fuzzy clustering. If fuzzy is False, a hard clustering is performed

init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None

Initial row labels

init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None

Initial column labels

max_iter : int, optional, default: 50

Maximum number of iterations

n_init : int, optional, default: 1

Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.

random_state : integer or numpy.random.RandomState, optional

The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.

tol : float, default: 1e-6

Relative tolerance with regard to the criterion to declare convergence

row_labels_ : array-like, shape (n_rows,)

Bicluster label of each row

column_labels_ : array-like, shape (n_cols,)

Bicluster label of each column

gamma_kl : array-like, shape (k, l, v)

Value \(\frac{p_{kl}}{p_{k.} \times p_{.l}}\) for each row cluster k and column cluster l

gamma_kl_evolution : array-like, shape (k, l, max_iter)

Value of gamma_kl of each bicluster according to iterations
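How tol and max_iter interact can be sketched with a generic stopping rule. The exact test used by the library is not given here; the code below assumes one common form, in which iteration stops once the relative change of the criterion falls below tol. The function has_converged and the criterion values are hypothetical.

```python
def has_converged(previous, current, tol=1e-6):
    """True when the relative change of the criterion is at most tol."""
    return abs(current - previous) <= tol * abs(previous)


# Made-up criterion values recorded over successive iterations.
criterion_values = [-1500.0, -1200.0, -1190.0, -1189.9995, -1189.9995]

stopped_at = None
for iteration in range(1, len(criterion_values)):
    if has_converged(criterion_values[iteration - 1], criterion_values[iteration]):
        stopped_at = iteration
        break

print(stopped_at)  # 3
```

If no iteration satisfies the test, a loop of this kind simply runs until the maximum number of iterations is reached.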

F_c(x, z, w, gammakl, pi_k, rho_l, choice='ZW')

Compute fuzzy log-likelihood (LL) criterion.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

gammakl : three-way numpy array, shape=(K, L, v_features)

tensor of block parameters

pi_k : numpy array, shape=(K,)

vector of row cluster proportions

rho_l : numpy array, shape=(L,)

vector of column cluster proportions

choice : string, takes values in ("Z", "W", "ZW")

which partition(s) to consider in the optimization of LL

(H_z, H_w, LL, value)

(row entropy, column entropy, Log-likelihood, lower bound of log-likelihood)

fit(X, y=None)

Perform Tensor co-clustering.

X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

gammakl(x, z, w)

Compute gamma_kl per block.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

gamma_kl_mat : three-way numpy array, shape=(K, L, v_features)

Computed parameters per block
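The ratio \(\frac{p_{kl}}{p_{k.} \times p_{.l}}\) can be written out with NumPy as a sketch. It assumes that p denotes the tensor normalised slice by slice (each slice divided by its total), so that the ratio equals N x_kl / (x_k. x_.l) for each slice; this reading is an assumption, and block_gamma is a hypothetical function rather than the library's gammakl method.

```python
import numpy as np


def block_gamma(x, z, w):
    """Per-slice ratio p_kl / (p_k. * p_.l) for every block (k, l)."""
    x = np.asarray(x, dtype=float)
    total = x.sum(axis=(0, 1))                              # N per slice, shape (v,)
    block = np.einsum("ik,ijv,jl->klv", z, x, w)            # block sums, shape (K, L, v)
    row_margin = np.einsum("ik,iv->kv", z, x.sum(axis=1))   # shape (K, v)
    col_margin = np.einsum("jl,jv->lv", w, x.sum(axis=0))   # shape (L, v)
    return total * block / (row_margin[:, None, :] * col_margin[None, :, :])


# Made-up tensor whose slices are rank one, so rows and columns are independent.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.0, 1.0, 2.0])
x = np.outer(a, b)[:, :, None] * np.array([1.0, 2.0])      # shape (4, 3, 2)
z = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])             # row partition, K = 2
w = np.array([[1, 0], [0, 1], [0, 1]])                     # column partition, L = 2

gamma = block_gamma(x, z, w)   # every entry is 1 when rows and columns are independent
```

Values above 1 would indicate a block that is denser than independence predicts, and values below 1 a sparser one.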

pi_k(z)

Compute row proportion.

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

pi_k_vect : numpy array, shape=(K,)

proportion of row clusters

rho_l(w)

Compute column proportion.

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

rho_l_vect : numpy array, shape=(L,)

proportion of column clusters

The TensorClus.coclustering.tensorCoclusteringGaussian module provides an implementation of a tensor co-clustering algorithm for continuous three-way tensors.

class TensorClus.coclustering.tensorCoclusteringGaussian.TensorCoclusteringGaussian(n_row_clusters=2, n_col_clusters=2, fuzzy=True, parsimonious=True, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)

Tensor Latent Block Model for Normal distribution.

n_row_clusters : int, optional, default: 2

Number of row clusters to form

n_col_clusters : int, optional, default: 2

Number of column clusters to form

fuzzy : boolean, optional, default: True

Perform fuzzy clustering. If fuzzy is False, a hard clustering is performed

parsimonious : boolean, optional, default: True

Use a parsimonious model. If parsimonious is False, sigma is computed at each iteration

init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None

Initial row labels

init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None

Initial column labels

max_iter : int, optional, default: 50

Maximum number of iterations

n_init : int, optional, default: 1

Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.

random_state : integer or numpy.random.RandomState, optional

The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.

tol : float, default: 1e-6

Relative tolerance with regard to the criterion to declare convergence

row_labels_ : array-like, shape (n_rows,)

Bicluster label of each row

column_labels_ : array-like, shape (n_cols,)

Bicluster label of each column

mu_kl : array-like, shape (k, l, v)

Mean vector for each row cluster k and column cluster l

sigma_kl_ : array-like, shape (k, l, v, v)

Covariance matrix for each row cluster k and column cluster l

F_c(x, z, w, mukl, sigma_x_kl, pi_k, rho_l, choice='ZW')

Compute fuzzy log-likelihood (LL) criterion.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

mukl : three-way numpy array, shape=(K, L, v_features)

tensor of mean parameters per block

sigma_x_kl : four-way numpy array, shape=(K, L, v_features, v_features)

tensor of sigma matrices for all blocks

pi_k : numpy array, shape=(K,)

vector of row cluster proportions

rho_l : numpy array, shape=(L,)

vector of column cluster proportions

choice : string, takes values in ("Z", "W", "ZW")

which partition(s) to consider in the optimization of LL

(H_z, H_w, LL, value)

(row entropy, column entropy, Log-likelihood, lower bound of log-likelihood)

fit(X, y=None)

Perform Tensor co-clustering.

X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

mukl(x, z, w)

Compute the mean vector mu_kl per block.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

mukl_mat : three-way numpy array, shape=(K, L, v_features)

Computed parameters per block
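A per-block mean vector can be sketched with a single einsum call. The code assumes that mu_kl is the average of the feature vectors x[i, j, :] weighted by the memberships z[i, k] * w[j, l]; block_means is a hypothetical helper written for illustration, not the library's mukl method.

```python
import numpy as np


def block_means(x, z, w):
    """Weighted mean feature vector of every block, shape (K, L, v)."""
    x = np.asarray(x, dtype=float)
    block_sum = np.einsum("ik,ijv,jl->klv", z, x, w)
    block_weight = np.outer(z.sum(axis=0), w.sum(axis=0))   # shape (K, L)
    return block_sum / block_weight[:, :, None]


# Made-up tensor that is constant inside each block.
true_means = np.arange(8, dtype=float).reshape(2, 2, 2)     # (K, L, v)
row_labels = np.array([0, 0, 1, 1])
col_labels = np.array([0, 1, 1])
x = true_means[row_labels][:, col_labels]                   # shape (4, 3, 2)
z = np.eye(2)[row_labels]                                   # one-hot row partition
w = np.eye(2)[col_labels]                                   # one-hot column partition

mu = block_means(x, z, w)   # recovers true_means for this noise-free example
```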

pi_k(z)

Compute row proportion.

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

pi_k_vect : numpy array, shape=(K,)

proportion of row clusters

rho_l(w)

Compute column proportion.

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

rho_l_vect : numpy array, shape=(L,)

proportion of column clusters

sigma_x_kl(x, z, w, mukl)

Compute the covariance matrix sigma_kl per block.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

mukl : numpy array, shape=(K, L, v_features)

tensor of mukl values

sigma_x_kl_mat : four-way numpy array, shape=(K, L, v_features, v_features)

Computed covariance parameters per block
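The per-block covariance can be sketched in the same spirit. The code assumes a membership-weighted, biased (divide by total weight) covariance of the feature vectors around the block mean; the normalisation actually used by the library is not stated here, and block_covariances is a hypothetical helper.

```python
import numpy as np


def block_covariances(x, z, w, mu):
    """Weighted covariance matrix of every block, shape (K, L, v, v)."""
    x = np.asarray(x, dtype=float)
    n_row_clusters, n_col_clusters, n_features = mu.shape
    sigma = np.zeros((n_row_clusters, n_col_clusters, n_features, n_features))
    for k in range(n_row_clusters):
        for l in range(n_col_clusters):
            weights = np.outer(z[:, k], w[:, l])    # membership of cell (i, j), shape (n, d)
            centered = x - mu[k, l]                 # deviations from the block mean, (n, d, v)
            scatter = np.einsum("ij,iju,ijv->uv", weights, centered, centered)
            sigma[k, l] = scatter / weights.sum()
    return sigma


# Single-block check on made-up data: the result matches the biased sample covariance.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4, 3))
z = np.ones((5, 1))
w = np.ones((4, 1))
mu = x.mean(axis=(0, 1)).reshape(1, 1, 3)
sigma = block_covariances(x, z, w, mu)
```

The explicit loop over blocks keeps memory use modest, since only one (n, d, v) array of deviations exists at a time.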

The TensorClus.coclustering.tensorCoclusteringBernoulli module provides an implementation of a tensor co-clustering algorithm for binary three-way tensors.

class TensorClus.coclustering.tensorCoclusteringBernoulli.TensorCoclusteringBernoulli(n_row_clusters=2, n_col_clusters=2, fuzzy=False, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)

Tensor Latent Block Model for Bernoulli distribution.

n_row_clusters : int, optional, default: 2

Number of row clusters to form

n_col_clusters : int, optional, default: 2

Number of column clusters to form

fuzzy : boolean, optional, default: False

Perform fuzzy clustering. If fuzzy is False, a hard clustering is performed

init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None

Initial row labels

init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None

Initial column labels

max_iter : int, optional, default: 50

Maximum number of iterations

n_init : int, optional, default: 1

Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.

random_state : integer or numpy.random.RandomState, optional

The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.

tol : float, default: 1e-6

Relative tolerance with regard to the criterion to declare convergence

row_labels_ : array-like, shape (n_rows,)

Bicluster label of each row

column_labels_ : array-like, shape (n_cols,)

Bicluster label of each column

mu_kl : array-like, shape (k, l, v)

Mean vector for each row cluster k and column cluster l

F_c(x, z, w, mukl, pi_k, rho_l, choice='ZW')

Compute fuzzy log-likelihood (LL) criterion.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

mukl : three-way numpy array, shape=(K, L, v_features)

tensor of mean parameters per block

pi_k : numpy array, shape=(K,)

vector of row cluster proportions

rho_l : numpy array, shape=(L,)

vector of column cluster proportions

choice : string, takes values in ("Z", "W", "ZW")

which partition(s) to consider in the optimization of LL

(H_z, H_w, LL, value)

(row entropy, column entropy, Log-likelihood, lower bound of log-likelihood)

fit(X, y=None)

Perform Tensor co-clustering.

X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

mukl(x, z, w)

Compute the mean vector mu_kl per block.

x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)

Tensor to be analyzed

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

mukl_mat : three-way numpy array

pi_k(z)

Compute row proportion.

z : numpy array, shape=(n_row_objects, K)

matrix of row partition

pi_k_vect : numpy array, shape=(K,)

proportion of row clusters

rho_l(w)

Compute column proportion.

w : numpy array, shape=(d_col_objects, L)

matrix of column partition

rho_l_vect : numpy array, shape=(L,)

proportion of column clusters

TensorClus vizualisation

The TensorClus.vizualisation module provides functions to visualize different measures or data.

TensorClus.vizualisation.__init__.Plot_CoClust_axes_etiquette(title, fig, axes, data, phiR, phiC, K, L, etiquette)

Plot co-clustering results for each slice on specific axes.

title : title of the figure

fig : figure that includes all axes

axes : list of axes corresponding to the number of slices

data : tensor data

phiR : row clustering partition

phiC : column clustering partition

K : number of row clusters

L : number of column clusters

etiquette : names of the slices

TensorClus.vizualisation.__init__.duplicates(lst, item)

Find the indices of duplicated values.

lst : list of values

item : value to look for

list

indices of duplicated values
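The behaviour described for duplicates(lst, item) can be sketched with a list comprehension. This is an assumption about what the function returns (every position at which item occurs in lst), written under the hypothetical name find_indices; it is not the library's source.

```python
def find_indices(values, item):
    """Return every position at which item occurs in values."""
    return [index for index, value in enumerate(values) if value == item]


labels = ["a", "b", "a", "c", "a"]
print(find_indices(labels, "a"))  # [0, 2, 4]
```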

TensorClus.vizualisation.__init__.generateColour()

Generate a random color.

str

hex color
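One way to produce a random hex colour string is shown below. The sketch assumes the usual '#rrggbb' format and a uniform draw over the RGB cube; random_hex_colour is a hypothetical function, and the way generateColour actually draws its colour is not documented here.

```python
import random


def random_hex_colour(rng=random):
    """Return a colour string of the form '#rrggbb'."""
    # 0x1000000 = 256 ** 3 possible colours; pad with zeros to six hex digits.
    return "#{:06x}".format(rng.randrange(0x1000000))


# A seeded generator makes the colour reproducible across runs.
colour = random_hex_colour(random.Random(0))
```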

TensorClus.vizualisation.__init__.plot_logLikelihood_evolution(model, do_plot=True, save=False, dpi=200)

Plot all intermediate log-likelihood values for a model at each iteration.

model : TensorClus.coclustering, fitted model

do_plot : boolean, whether the plot should be displayed. True by default. Disabling this allows users to handle displaying the plot themselves.

save : boolean, False by default. Whether to save the plot as an image.

dpi : int, 200 by default. Resolution used when saving the image.

TensorClus.vizualisation.__init__.plot_parameter_evolution(model, do_plot=True, save=False, dpi=200)

Plot all intermediate gammaKK parameters for a model at each iteration.

model : TensorClus.coclustering, fitted model

do_plot : boolean, whether the plot should be displayed. True by default. Disabling this allows users to handle displaying the plot themselves.

save : boolean, False by default. Whether to save the plot as an image.

dpi : int, 200 by default. Resolution used when saving the image.

TensorClus.vizualisation.__init__.plot_slice_reorganisation(data, model, slicesName=None, do_plot=True, save=False, dpi=200)

Plot each slice of the tensor reorganised according to the co-clustering of a fitted model.

data : tensor data

model : TensorClus.coclustering, fitted model

slicesName : list of slice names

do_plot : boolean, whether the plot should be displayed. True by default. Disabling this allows users to handle displaying the plot themselves.

save : boolean, False by default. Whether to save the plot as an image.

dpi : int, 200 by default. Resolution used when saving the image.
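The reorganisation that such a plot displays can be sketched without any plotting code. The example assumes that reorganising a slice means sorting its rows and columns by cluster label so that co-clusters appear as contiguous blocks; reorganise_slice, the labels, and the data are all hypothetical.

```python
import numpy as np


def reorganise_slice(slice_2d, row_labels, column_labels):
    """Reorder one slice so that rows and columns are grouped by cluster."""
    row_order = np.argsort(row_labels, kind="stable")
    column_order = np.argsort(column_labels, kind="stable")
    return slice_2d[np.ix_(row_order, column_order)]


# Made-up tensor with 4 rows, 3 columns, 2 slices, and made-up cluster labels.
data = np.arange(24).reshape(4, 3, 2)
row_labels = np.array([1, 0, 1, 0])
column_labels = np.array([1, 0, 0])

# Apply the same row and column order to every slice.
reordered = np.stack(
    [reorganise_slice(data[:, :, v], row_labels, column_labels)
     for v in range(data.shape[2])],
    axis=2,
)
```

Each reordered slice could then be passed to an image-plotting routine, one axis per slice.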