TensorClus reader
The TensorClus.reader module provides functions to load and read different data formats.
- TensorClus.reader.load.load_dataset(datasetName)[source]
Load one of the available datasets.
- datasetName : str
the name of the dataset
- tensor
three-way numpy array
- labels
true row classes (ground truth)
- slices
slice names
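A minimal sketch of the consistency checks one might run on the returned triple. It builds a synthetic stand-in rather than calling `load_dataset`, since the available dataset names are not listed here. The helper `check_dataset` is hypothetical, and the assumption that `slices` holds one name per entry of the third mode is not confirmed by the reference above.

```python
import numpy as np

def check_dataset(tensor, labels, slices):
    """Hypothetical helper: validate a (tensor, labels, slices) triple."""
    tensor = np.asarray(tensor)
    if tensor.ndim != 3:
        raise ValueError("tensor must be a three-way array")
    if len(labels) != tensor.shape[0]:
        raise ValueError("expected one ground-truth label per row")
    # Assumption: one slice name per entry of the third mode.
    if len(slices) != tensor.shape[2]:
        raise ValueError("expected one name per slice")
    return tensor.shape

# Synthetic stand-in for the values returned by load_dataset.
tensor = np.zeros((6, 4, 3))
labels = [0, 0, 0, 1, 1, 1]
slices = ["slice_a", "slice_b", "slice_c"]
shape = check_dataset(tensor, labels, slices)
```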
TensorClus decomposition
The TensorClus.decomposition.decomposition_with_clustering module provides a class with common methods for applying multiple clustering algorithms to decomposition results.
- class TensorClus.decomposition.decomposition_with_clustering.DecompositionWithClustering(n_clusters=[2, 2, 2], modes=[1, 2, 3], algorithm='Kmeans++')[source]
Clustering from decomposition results.
- n_clusters : array-like, optional, default: [2, 2, 2]
Number of clusters to form, one entry per selected mode
- modes : array-like, optional, default: [1, 2, 3]
Selected modes for clustering
- algorithm : string, optional, default: 'Kmeans++'
Selected algorithm for clustering
- labels_ : array-like, shape (n_rows,)
Clustering label of each row
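The reference does not say which decomposition is used. As an illustration only, the sketch below unfolds a tensor along one mode and takes the leading left singular vectors as a low-dimensional representation of that mode; a clustering algorithm such as k-means++ would then be run on the rows of this factor. The function names `unfold` and `mode_factor` are hypothetical, and modes are numbered from 1 to mirror `modes=[1, 2, 3]`.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding of a tensor; `mode` is 1-based."""
    axis = mode - 1
    return np.moveaxis(tensor, axis, 0).reshape(tensor.shape[axis], -1)

def mode_factor(tensor, mode, rank):
    """Leading `rank` left singular vectors of the mode-n unfolding."""
    u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
    return u[:, :rank]

rng = np.random.default_rng(0)
x = rng.random((5, 4, 3))
# One factor matrix per mode; its rows are the objects to be clustered.
factors = {mode: mode_factor(x, mode, rank=2) for mode in (1, 2, 3)}
```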
TensorClus coclustering
The TensorClus.coclustering.sparseTensorCoclustering module provides an implementation of a sparse tensor co-clustering algorithm.
- class TensorClus.coclustering.sparseTensorCoclustering.SparseTensorCoclusteringPoisson(n_clusters=2, fuzzy=True, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)[source]
Tensor Latent Block Model for Poisson distribution.
- n_clusters : int, optional, default: 2
Number of clusters to form
- fuzzy : boolean, optional, default: True
Provide fuzzy clustering. If fuzzy is False, a hard clustering is performed
- init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None
Initial row labels
- init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None
Initial column labels
- max_iter : int, optional, default: 50
Maximum number of iterations
- n_init : int, optional, default: 1
Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.
- random_state : integer or numpy.RandomState, optional
The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
- tol : float, default: 1e-06
Relative tolerance with regard to the criterion used to declare convergence
- row_labels_ : array-like, shape (n_rows,)
Bicluster label of each row
- column_labels_ : array-like, shape (n_cols,)
Bicluster label of each column
- gamma_kl : array-like, shape (k, l, v)
Value \(\frac{p_{kl}}{p_{k.} \times p_{.l}}\) for each row cluster k and column cluster l
- gamma_kl_evolution : array-like, shape (k, l, max_iter)
Value of gamma_kl of each bicluster across iterations
- F_c(x, z, w, gammakl, pi_k, rho_l, choice='ZW')[source]
Compute fuzzy log-likelihood (LL) criterion.
- x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- z : numpy array, shape=(n_row_objects, K)
matrix of row partition
- w : numpy array, shape=(d_col_objects, L)
matrix of column partition
- gammakl : three-way numpy array, shape=(K, L, v_features)
tensor of block parameters
- pi_k : numpy array, shape=(K,)
vector of row cluster proportions
- rho_l : numpy array, shape=(L,)
vector of column cluster proportions
- choice : string, takes values in ("Z", "W", "ZW")
partition(s) considered in the optimization of LL
- (H_z, H_w, LL, value)
(row entropy, column entropy, log-likelihood, lower bound of log-likelihood)
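The entropy terms H_z and H_w are not defined above. A minimal sketch, assuming the usual definition H = -sum(q log q) over a fuzzy partition matrix; the helper name `partition_entropy` is hypothetical and is not part of the library.

```python
import numpy as np

def partition_entropy(q, eps=1e-12):
    """Entropy -sum(q * log(q)) of a partition matrix; 0 * log(0) counts as 0."""
    q = np.asarray(q, dtype=float)
    return float(-np.sum(q * np.log(np.clip(q, eps, None))))

hard = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])   # hard partition
fuzzy = np.full((3, 2), 0.5)                             # maximally uncertain
h_hard = partition_entropy(hard)
h_fuzzy = partition_entropy(fuzzy)
```

A hard partition contributes no entropy, so under this definition the entropy terms only matter when `fuzzy=True`.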
- fit(X, y=None)[source]
Perform Tensor co-clustering.
- X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- gammakl(x, z, w)[source]
Compute gamma_kl per block.
- x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- z : numpy array, shape=(n_row_objects, K)
matrix of row partition
- w : numpy array, shape=(d_col_objects, L)
matrix of column partition
- gamma_kl_mat
three-way numpy array, shape=(K, L, v_features). Computed parameters per block
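A minimal NumPy sketch of the ratio \(\frac{p_{kl}}{p_{k.} \times p_{.l}}\) given above, evaluated separately for each slice. It assumes that the proportions are block, row-cluster, and column-cluster sums divided by the slice total; the library's own normalisation may differ, and `gamma_kl_sketch` is a hypothetical name.

```python
import numpy as np

def gamma_kl_sketch(x, z, w):
    """Ratio p_kl / (p_k. * p_.l), computed independently for each slice v."""
    total = x.sum(axis=(0, 1))                          # slice totals, (v,)
    block = np.einsum("ik,ijv,jl->klv", z, x, w)        # block sums, (K, L, v)
    row = z.T @ x.sum(axis=1)                           # row-cluster sums, (K, v)
    col = w.T @ x.sum(axis=0)                           # column-cluster sums, (L, v)
    # (block / total) / ((row / total) * (col / total))
    return block * total / (row[:, None, :] * col[None, :, :])

# One slice built as an outer product, so rows and columns are independent.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 1.0, 5.0])
x = np.outer(a, b)[:, :, None]                          # shape (4, 3, 1)
z = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
w = np.array([[1, 0], [0, 1], [0, 1]], dtype=float)
gamma = gamma_kl_sketch(x, z, w)
```

Because the slice is an outer product, every block ratio equals 1; values away from 1 indicate blocks that are denser or sparser than independence would predict.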
The TensorClus.coclustering.tensorCoclusteringPoisson module provides an implementation of a tensor co-clustering algorithm for three-way count tensors.
- class TensorClus.coclustering.tensorCoclusteringPoisson.TensorCoclusteringPoisson(n_row_clusters=2, n_col_clusters=2, fuzzy=True, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)[source]
Tensor Latent Block Model for Poisson distribution.
- n_row_clusters : int, optional, default: 2
Number of row clusters to form
- n_col_clusters : int, optional, default: 2
Number of column clusters to form
- fuzzy : boolean, optional, default: True
Provide fuzzy clustering. If fuzzy is False, a hard clustering is performed
- init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None
Initial row labels
- init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None
Initial column labels
- max_iter : int, optional, default: 50
Maximum number of iterations
- n_init : int, optional, default: 1
Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.
- random_state : integer or numpy.RandomState, optional
The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
- tol : float, default: 1e-06
Relative tolerance with regard to the criterion used to declare convergence
- row_labels_ : array-like, shape (n_rows,)
Bicluster label of each row
- column_labels_ : array-like, shape (n_cols,)
Bicluster label of each column
- gamma_kl : array-like, shape (k, l, v)
Value \(\frac{p_{kl}}{p_{k.} \times p_{.l}}\) for each row cluster k and column cluster l
- gamma_kl_evolution : array-like, shape (k, l, max_iter)
Value of gamma_kl of each bicluster across iterations
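With `fuzzy=True` the row and column partitions are soft membership matrices. A common way to obtain one label per object, and plausibly how `row_labels_` and `column_labels_` are derived (an assumption, not confirmed above), is to take the most probable cluster. The helper `hard_labels` is hypothetical.

```python
import numpy as np

def hard_labels(membership):
    """Index of the largest membership in each row of a fuzzy partition matrix."""
    return np.argmax(np.asarray(membership), axis=1)

z_fuzzy = np.array([[0.9, 0.1],
                    [0.2, 0.8],
                    [0.6, 0.4]])
row_labels = hard_labels(z_fuzzy)
```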
- F_c(x, z, w, gammakl, pi_k, rho_l, choice='ZW')[source]
Compute fuzzy log-likelihood (LL) criterion.
- x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- z : numpy array, shape=(n_row_objects, K)
matrix of row partition
- w : numpy array, shape=(d_col_objects, L)
matrix of column partition
- gammakl : three-way numpy array, shape=(K, L, v_features)
tensor of block parameters
- pi_k : numpy array, shape=(K,)
vector of row cluster proportions
- rho_l : numpy array, shape=(L,)
vector of column cluster proportions
- choice : string, takes values in ("Z", "W", "ZW")
partition(s) considered in the optimization of LL
- (H_z, H_w, LL, value)
(row entropy, column entropy, log-likelihood, lower bound of log-likelihood)
- fit(X, y=None)[source]
Perform Tensor co-clustering.
- X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
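A usage sketch built only from the constructor signature, `fit`, and the attributes listed above. The synthetic count tensor and the wrapper `fit_poisson_coclustering` are illustrative. The import is deferred so that the data-generation part runs without TensorClus installed, and the behaviour of the call itself has not been checked against the library.

```python
import numpy as np

def fit_poisson_coclustering(X, n_row_clusters=2, n_col_clusters=2, seed=0):
    """Hypothetical wrapper around the documented class; requires TensorClus."""
    from TensorClus.coclustering.tensorCoclusteringPoisson import (
        TensorCoclusteringPoisson,
    )
    model = TensorCoclusteringPoisson(
        n_row_clusters=n_row_clusters,
        n_col_clusters=n_col_clusters,
        random_state=seed,
    )
    model.fit(X)
    return model.row_labels_, model.column_labels_

# Synthetic count tensor with shape (n_row_objects, d_col_objects, v_features).
rng = np.random.default_rng(0)
X = rng.poisson(lam=3.0, size=(20, 15, 4))

# With TensorClus installed:
# row_labels, column_labels = fit_poisson_coclustering(X)
```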
- gammakl(x, z, w)[source]
Compute gamma_kl per block.
- x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- z : numpy array, shape=(n_row_objects, K)
matrix of row partition
- w : numpy array, shape=(d_col_objects, L)
matrix of column partition
- gamma_kl_mat
three-way numpy array, shape=(K, L, v_features). Computed parameters per block
The TensorClus.coclustering.tensorCoclusteringGaussian module provides an implementation of a tensor co-clustering algorithm for continuous three-way tensors.
- class TensorClus.coclustering.tensorCoclusteringGaussian.TensorCoclusteringGaussian(n_row_clusters=2, n_col_clusters=2, fuzzy=True, parsimonious=True, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)[source]
Tensor Latent Block Model for Normal distribution.
- n_row_clusters : int, optional, default: 2
Number of row clusters to form
- n_col_clusters : int, optional, default: 2
Number of column clusters to form
- fuzzy : boolean, optional, default: True
Provide fuzzy clustering. If fuzzy is False, a hard clustering is performed
- parsimonious : boolean, optional, default: True
Provide a parsimonious model. If parsimonious is False, sigma is computed at each iteration
- init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None
Initial row labels
- init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None
Initial column labels
- max_iter : int, optional, default: 50
Maximum number of iterations
- n_init : int, optional, default: 1
Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.
- random_state : integer or numpy.RandomState, optional
The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
- tol : float, default: 1e-06
Relative tolerance with regard to the criterion used to declare convergence
- row_labels_ : array-like, shape (n_rows,)
Bicluster label of each row
- column_labels_ : array-like, shape (n_cols,)
Bicluster label of each column
- mu_kl : array-like, shape (k, l, v)
Mean vector for each row cluster k and column cluster l
- sigma_kl_ : array-like, shape (k, l, v, v)
Covariance matrix for each row cluster k and column cluster l
- F_c(x, z, w, mukl, sigma_x_kl, pi_k, rho_l, choice='ZW')[source]
Compute fuzzy log-likelihood (LL) criterion.
- x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- z : numpy array, shape=(n_row_objects, K)
matrix of row partition
- w : numpy array, shape=(d_col_objects, L)
matrix of column partition
- mukl : three-way numpy array, shape=(K, L, v_features)
tensor of mean parameters per block
- sigma_x_kl : four-way numpy array, shape=(K, L, v_features, v_features)
tensor of sigma matrices for all blocks
- pi_k : numpy array, shape=(K,)
vector of row cluster proportions
- rho_l : numpy array, shape=(L,)
vector of column cluster proportions
- choice : string, takes values in ("Z", "W", "ZW")
partition(s) considered in the optimization of LL
- (H_z, H_w, LL, value)
(row entropy, column entropy, log-likelihood, lower bound of log-likelihood)
- fit(X, y=None)[source]
Perform Tensor co-clustering.
- X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- mukl(x, z, w)[source]
Compute the mean vector mu_kl per block.
- x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- z : numpy array, shape=(n_row_objects, K)
matrix of row partition
- w : numpy array, shape=(d_col_objects, L)
matrix of column partition
- mukl_mat
three-way numpy array, shape=(K, L, v_features). Computed parameters per block
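A minimal sketch of a per-block mean, assuming the weighted average sum_ij z_ik w_jl x_ij / (sum_i z_ik * sum_j w_jl) and no empty clusters. This illustrates the idea and is not necessarily how the library computes it; `block_means` is a hypothetical name.

```python
import numpy as np

def block_means(x, z, w):
    """Weighted mean vector of each (row cluster, column cluster) block."""
    sums = np.einsum("ik,ijv,jl->klv", z, x, w)        # weighted sums, (K, L, v)
    weights = np.outer(z.sum(axis=0), w.sum(axis=0))   # block weights, (K, L)
    return sums / weights[:, :, None]

# 4 x 2 x 1 tensor: rows 0-1 hold 1.0, rows 2-3 hold 5.0.
x = np.array([1.0, 1.0, 5.0, 5.0])[:, None, None] * np.ones((4, 2, 1))
z = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
w = np.ones((2, 1))                                     # a single column cluster
mu = block_means(x, z, w)
```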
- pi_k(z)[source]
Compute row proportion.
- z : numpy array, shape=(n_row_objects, K)
matrix of row partition
- pi_k_vect
numpy array, shape=(K,). Proportion of row clusters
- rho_l(w)[source]
Compute column proportion.
- w : numpy array, shape=(d_col_objects, L)
matrix of column partition
- rho_l_vect
numpy array, shape=(L,). Proportion of column clusters
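Both proportions can be read as the column means of the corresponding partition matrix. A minimal sketch under that assumption, which works for hard and fuzzy partitions alike; the helper `proportions` is hypothetical.

```python
import numpy as np

def proportions(partition):
    """Column means of a partition matrix: share of objects in each cluster."""
    partition = np.asarray(partition, dtype=float)
    return partition.sum(axis=0) / partition.shape[0]

z = np.array([[1, 0], [1, 0], [1, 0], [0, 1]])          # 4 rows, K = 2 (hard)
w = np.array([[0.5, 0.5], [0.2, 0.8]])                  # 2 columns, L = 2 (fuzzy)
pi_k_vect = proportions(z)
rho_l_vect = proportions(w)
```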
- sigma_x_kl(x, z, w, mukl)[source]
Compute the covariance matrix sigma_kl per block.
- x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- z : numpy array, shape=(n_row_objects, K)
matrix of row partition
- w : numpy array, shape=(d_col_objects, L)
matrix of column partition
- mukl : numpy array, shape=(K, L, v_features)
tensor of mukl values
- sigma_x_kl_mat
four-way numpy array, shape=(K, L, v_features, v_features). Computed covariance parameters per block
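A minimal sketch of a per-block covariance, assuming the weighted scatter of each cell's feature vector around its block mean, divided by the block weight. How the `parsimonious` option changes this is not described above and is not modelled here; `block_covariances` is a hypothetical name.

```python
import numpy as np

def block_covariances(x, z, w, mukl):
    """Weighted covariance matrix of each block, shape (K, L, v, v)."""
    # Deviation of every cell from every block mean: (n, d, K, L, v).
    diff = x[:, :, None, None, :] - mukl[None, None, :, :, :]
    scatter = np.einsum("ik,jl,ijklv,ijklu->klvu", z, w, diff, diff)
    weights = np.outer(z.sum(axis=0), w.sum(axis=0))    # block weights, (K, L)
    return scatter / weights[:, :, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 5, 3))
z = np.eye(2)[[0, 0, 0, 1, 1, 1]]                       # hard row partition
w = np.eye(2)[[0, 0, 1, 1, 1]]                          # hard column partition
weights = np.outer(z.sum(axis=0), w.sum(axis=0))
mukl = np.einsum("ik,ijv,jl->klv", z, x, w) / weights[:, :, None]
sigma = block_covariances(x, z, w, mukl)
```

With hard partitions each block matrix reduces to the biased sample covariance of the feature vectors falling in that block.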
The TensorClus.coclustering.tensorCoclusteringBernoulli module provides an implementation of a tensor co-clustering algorithm for binary three-way tensors.
- class TensorClus.coclustering.tensorCoclusteringBernoulli.TensorCoclusteringBernoulli(n_row_clusters=2, n_col_clusters=2, fuzzy=False, init_row=None, init_col=None, max_iter=50, n_init=1, tol=1e-06, random_state=None, gpu=None)[source]
Tensor Latent Block Model for Bernoulli distribution.
- n_row_clusters : int, optional, default: 2
Number of row clusters to form
- n_col_clusters : int, optional, default: 2
Number of column clusters to form
- fuzzy : boolean, optional, default: False
Provide fuzzy clustering. If fuzzy is False, a hard clustering is performed
- init_row : numpy array or scipy sparse matrix, shape (n_rows, K), optional, default: None
Initial row labels
- init_col : numpy array or scipy sparse matrix, shape (n_cols, L), optional, default: None
Initial column labels
- max_iter : int, optional, default: 50
Maximum number of iterations
- n_init : int, optional, default: 1
Number of times the algorithm will be run with different initializations. The final result will be the best output of n_init consecutive runs.
- random_state : integer or numpy.RandomState, optional
The generator used to initialize the centers. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
- tol : float, default: 1e-06
Relative tolerance with regard to the criterion used to declare convergence
- row_labels_ : array-like, shape (n_rows,)
Bicluster label of each row
- column_labels_ : array-like, shape (n_cols,)
Bicluster label of each column
- mu_kl : array-like, shape (k, l, v)
Mean vector for each row cluster k and column cluster l
- F_c(x, z, w, mukl, pi_k, rho_l, choice='ZW')[source]
Compute fuzzy log-likelihood (LL) criterion.
- x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- z : numpy array, shape=(n_row_objects, K)
matrix of row partition
- w : numpy array, shape=(d_col_objects, L)
matrix of column partition
- mukl : three-way numpy array, shape=(K, L, v_features)
tensor of mean parameters per block
- pi_k : numpy array, shape=(K,)
vector of row cluster proportions
- rho_l : numpy array, shape=(L,)
vector of column cluster proportions
- choice : string, takes values in ("Z", "W", "ZW")
partition(s) considered in the optimization of LL
- (H_z, H_w, LL, value)
(row entropy, column entropy, log-likelihood, lower bound of log-likelihood)
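A minimal sketch of the data term of a Bernoulli log-likelihood for given partitions, assuming the standard latent block model form sum z_ik w_jl [x log(mu) + (1 - x) log(1 - mu)]. The exact criterion computed by `F_c`, including the proportion and entropy terms, is not specified above, and `bernoulli_block_ll` is a hypothetical name.

```python
import numpy as np

def bernoulli_block_ll(x, z, w, mukl, eps=1e-10):
    """sum over i, j, k, l, v of z_ik w_jl [x log(mu) + (1 - x) log(1 - mu)]."""
    mu = np.clip(mukl, eps, 1.0 - eps)                  # avoid log(0)
    ones = np.einsum("ik,ijv,jl->klv", z, x, w)         # ones per block and slice
    cells = np.outer(z.sum(axis=0), w.sum(axis=0))[:, :, None]
    zeros = cells - ones                                # zeros per block and slice
    return float(np.sum(ones * np.log(mu) + zeros * np.log1p(-mu)))

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(4, 3, 2)).astype(float)
z = np.eye(2)[[0, 0, 1, 1]]                             # hard row partition
w = np.eye(2)[[0, 1, 1]]                                # hard column partition
ll = bernoulli_block_ll(x, z, w, mukl=np.full((2, 2, 2), 0.5))
```

With every block parameter set to 0.5 the data carry no information, so the value is simply the number of cells (4 x 3 x 2 = 24) times log(0.5).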
- fit(X, y=None)[source]
Perform Tensor co-clustering.
- X : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- mukl(x, z, w)[source]
Compute the mean vector mu_kl per block.
- x : three-way numpy array, shape=(n_row_objects, d_col_objects, v_features)
Tensor to be analyzed
- z : numpy array, shape=(n_row_objects, K)
matrix of row partition
- w : numpy array, shape=(d_col_objects, L)
matrix of column partition
- mukl_mat
three-way numpy array, shape=(K, L, v_features)
TensorClus vizualisation
The TensorClus.vizualisation module provides functions to visualize different measures or data.
- TensorClus.vizualisation.__init__.Plot_CoClust_axes_etiquette(title, fig, axes, data, phiR, phiC, K, L, etiquette)[source]
Plot CoClustering results for each slice on specific axes.
title : title of the figure
fig : figure that includes all axes
axes : list of axes, one per slice
data : tensor data
phiR : row clustering partition
phiC : column clustering partition
K : number of row clusters
L : number of column clusters
etiquette : names of the slices
- TensorClus.vizualisation.__init__.duplicates(lst, item)[source]
Find index of duplicated values.
lst : list of values
item : value to look for
- list
indices of the duplicated values
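Based only on the description above, the behaviour can be sketched as returning every position at which `item` occurs in `lst`. This is a hypothetical re-implementation named `duplicates_sketch`, not the library's code.

```python
def duplicates_sketch(lst, item):
    """Return every index at which `item` occurs in `lst`."""
    return [index for index, value in enumerate(lst) if value == item]

positions = duplicates_sketch(["a", "b", "a", "c", "a"], "a")
```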
- TensorClus.vizualisation.__init__.plot_logLikelihood_evolution(model, do_plot=True, save=False, dpi=200)[source]
Plot the intermediate log-likelihood values of a model at each iteration.
model : TensorClus.coclustering, fitted model
do_plot : boolean, whether the plot should be displayed. True by default. Disabling this allows users to handle displaying the plot themselves.
save : boolean, False by default. Whether to save the plot as an image
dpi : int, 200 by default. Resolution to use when saving the image
- TensorClus.vizualisation.__init__.plot_parameter_evolution(model, do_plot=True, save=False, dpi=200)[source]
Plot all intermediate gammaKK parameters for a model at each iteration.
model : TensorClus.coclustering, fitted model
do_plot : boolean, whether the plot should be displayed. True by default. Disabling this allows users to handle displaying the plot themselves.
save : boolean, False by default. Whether to save the plot as an image
dpi : int, 200 by default. Resolution to use when saving the image
- TensorClus.vizualisation.__init__.plot_slice_reorganisation(data, model, slicesName=None, do_plot=True, save=False, dpi=200)[source]
Plot all intermediate modularities for a model.
data : tensor data
model : TensorClus.coclustering.CoclustMod, fitted model
slicesName : list of slice names
do_plot : boolean, whether the plot should be displayed. True by default. Disabling this allows users to handle displaying the plot themselves.
save : boolean, False by default. Whether to save the plot as an image
dpi : int, 200 by default. Resolution to use when saving the image
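Judging by its name, `plot_slice_reorganisation` displays each slice with rows and columns regrouped by cluster. The reordering step can be sketched as below, assuming one integer label per row and per column. The plotting itself is left out, and `reorganise_slices` is a hypothetical helper, not a library function.

```python
import numpy as np

def reorganise_slices(data, row_labels, column_labels):
    """Permute rows and columns of every slice so that clusters are contiguous."""
    row_order = np.argsort(row_labels, kind="stable")
    col_order = np.argsort(column_labels, kind="stable")
    # np.ix_ applies both permutations at once; the slice axis is left untouched.
    return data[np.ix_(row_order, col_order)]

data = np.arange(24).reshape(4, 3, 2)
row_labels = np.array([1, 0, 1, 0])
column_labels = np.array([1, 0, 0])
reordered = reorganise_slices(data, row_labels, column_labels)
```

A stable sort keeps the original order of objects within each cluster, which makes the reorganised slices easier to compare with the input.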