zero package

Subpackages

Submodules

zero.als module

class zero.als.MangakiALS(nb_components=20, nb_iterations=40, lambda_=0.1, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Alternating Least Squares: \(r_{ij} - mean_i = u_i^T v_j\). Ratings are preprocessed by removing the mean rating of each user. Then \(u_i\) and \(v_j\) are updated alternately, using the least-squares estimator (closed form).

ALS: Zhou, Yunhong, et al. “Large-scale parallel collaborative filtering for the netflix prize.” International Conference on Algorithmic Applications in Management. Springer, Berlin, Heidelberg, 2008. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.173.2797&rep=rep1&type=pdf

Implemented by Pierre Vigier, JJ Vie

factorize(matrix, random_state)[source]
fit(X, y)[source]
fit_single_user(rated_works, ratings)[source]
fit_user(user, matrix)[source]
fit_work(work, matrixT)[source]
get_shortname()[source]
property is_serializable
make_matrix(X, y)[source]
predict(X)[source]
predict_single_user(work_ids, user_parameters)[source]
unzip()[source]
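
A minimal usage sketch (not taken from these docs; the shape of X and y follows the make_matrix convention documented for zero.svd below, and calling set_parameters before fit is an assumption):

    import numpy as np
    from zero.als import MangakiALS

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 2]])  # (user_id, work_id) pairs
    y = np.array([4.0, 2.0, 5.0, 3.0])              # corresponding ratings

    als = MangakiALS(nb_components=20, nb_iterations=40, lambda_=0.1)
    als.set_parameters(nb_users=2, nb_works=3)      # assumed: declare dataset size before fitting
    als.fit(X, y)
    print(als.predict(np.array([[1, 1]])))          # predicted rating of user 1 for work 1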

zero.als2 module

class zero.als2.MangakiALS2(nb_components=20, nb_iterations=40, lambda_=0.1, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Alternating Least Squares for the “Singular Value Decomposition” model (aka latent factor model): \(r_{ij} - mean = bias_i + bias_j + u_i^T v_j\). This is a modified version of ALS for the SVD model. Ratings are preprocessed by removing the overall mean. Then (\(u_i\), \(bias_i\)) and (\(v_j\), \(bias_j\)) are updated alternately in closed form.

ALS: Zhou, Yunhong, et al. “Large-scale parallel collaborative filtering for the netflix prize.” International Conference on Algorithmic Applications in Management. Springer, Berlin, Heidelberg, 2008. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.173.2797&rep=rep1&type=pdf

SVD: Koren, Yehuda, and Robert Bell. “Advances in collaborative filtering.” Recommender systems handbook. Springer, Boston, MA, 2015. 77-118. https://pdfs.semanticscholar.org/6800/fbe3314be9f638fb075e15b489d1aadb3030.pdf

factorize(matrix, random_state)[source]
fit(X, y)[source]
fit_user(user, matrix)[source]
fit_work(work, matrixT)[source]
get_shortname()[source]
property is_serializable
make_matrix(X, y)[source]
predict(X)[source]
unzip()[source]
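
A hedged sketch of the closed-form user update implied by the model above (the notation is derived from the stated model rather than read from the code; regularizing the bias with the same \(\lambda\) is an assumption). Folding \(bias_i\) into the user factor by appending a constant 1 to each item factor, \(\tilde v_j = (v_j, 1)\), the least-squares update over the set \(I_i\) of works rated by user \(i\) is

\[(u_i, bias_i) = \Big(\sum_{j \in I_i} \tilde v_j \tilde v_j^T + \lambda I\Big)^{-1} \sum_{j \in I_i} (r_{ij} - mean - bias_j)\, \tilde v_j\]

and the update for \((v_j, bias_j)\) is symmetric, with users and works exchanged.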

zero.als3 module

class zero.als3.MangakiALS3(nb_components=20, nb_iterations=20, lambda_=0.1, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Alternating Least Squares for the “Singular Value Decomposition” model (aka latent factor model).

This implementation is intended to be shorter and easier to read than MangakiALS2, but its performance is slightly worse, possibly because the initialization differs (Gaussian instead of uniform, even though Zhou’s paper suggests a Gaussian initialization).

fit(X, y)[source]
fit_user(user_id)[source]
fit_work(work_id)[source]
get_shortname()[source]
init_vars()[source]
property is_serializable
predict(X)[source]
to_dict(X, y)[source]
to_sparse(X, y)[source]

zero.balse module

class zero.balse.MangakiBALSE(nb_components=10, nb_iterations=10, lambda_=0.1, alpha=0.01, with_bias=True, gamma=5, T=None, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

M = None
U = None
VT = None
fit(X, y)[source]
get_shortname()[source]
predict(X)[source]

zero.cfm module

zero.chrono module

class zero.chrono.Chrono(is_enabled)[source]

Bases: object

checkpoint = None
connection = None
is_enabled = True
save(title)[source]

zero.dataset module

class zero.dataset.AnonymizedData(X, y, y_text, nb_users, nb_works)

Bases: tuple

X

Alias for field number 0

nb_users

Alias for field number 3

nb_works

Alias for field number 4

y

Alias for field number 1

y_text

Alias for field number 2

class zero.dataset.Dataset[source]

Bases: object

decode_users(encoded_user_ids)[source]
encode_works(work_ids)[source]
load_csv(filename, convert=<class 'float'>, title_filename=None)[source]
make_anonymous_data(triplets, convert=<function Dataset.<lambda>>, ordered=False)[source]
save_csv(folder, suffix='')[source]
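
A small sketch of the AnonymizedData container documented above (the values are made up; only the field order follows the “Alias for field number” entries):

    import numpy as np
    from zero.dataset import AnonymizedData

    data = AnonymizedData(
        X=np.array([[0, 0], [1, 2]]),   # field 0: encoded (user_id, work_id) pairs
        y=np.array([4.0, 2.5]),         # field 1: numeric ratings
        y_text=['like', 'neutral'],     # field 2: original textual ratings
        nb_users=2,                     # field 3
        nb_works=3,                     # field 4
    )
    assert data.X is data[0] and data.nb_works == data[4]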

zero.efa module

class zero.efa.MangakiEFA(NB_COMPONENTS=20, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Factor Analysis: see http://scikit-learn.org/stable/modules/decomposition.html#factor-analysis. For a better way to get interpretable components, see MangakiNMF.

fit(X, y, truncated=None)[source]
fit_user(user_id, sparse_matrix_dict)[source]
get_shortname()[source]
make_matrix(X, y)[source]
predict(X)[source]
set_parameters(nb_users, nb_works)[source]
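
For reference, a minimal scikit-learn Factor Analysis call of the kind the docstring points to (plain scikit-learn, not the MangakiEFA wrapper; the dense user-by-work matrix is a toy example):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    R = np.random.default_rng(0).normal(size=(100, 30))  # dense user x work matrix
    fa = FactorAnalysis(n_components=20)
    U = fa.fit_transform(R)   # user factors, shape (100, 20)
    V = fa.components_        # work loadings, shape (20, 30)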

zero.fma module

zero.gbr module

class zero.gbr.MangakiGBR(nb_components=20, nb_estimators=2, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y)[source]
get_shortname()[source]
predict(X)[source]
prepare_features(X, U, V)[source]

zero.knn module

class zero.knn.MangakiKNN(nb_neighbors=20, rated_by_neighbors_at_least=3, missing_is_mean=True, weighted_neighbors=False, nb_iterations=None, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y, whole_dataset=False)[source]
fit_single_user(rated_works, ratings)[source]
get_neighbors(user_ids=None)[source]
get_shortname()[source]
property is_serializable
predict(X)[source]
predict_single_user(work_ids, neighbor_ids)[source]
zero.knn.cosine_similarity(X, Y=None)[source]
zero.knn.mean_of_nonzero(X, cols)[source]
zero.knn.normalize(X)[source]

zero.knn0 module

class zero.knn0.MangakiKNN(nb_users, nb_works, nb_neighbors=20)[source]

Bases: object

fit(X, y)[source]
predict(X)[source]

zero.knn2 module

class zero.knn2.MangakiKNN2(nb_neighbors=20, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Toy implementation (not usable in production) of KNN for the mere sake of science. \(N\) users, \(M\) ~ 10k works, \(P\) ~ 300k user-work pairs, \(K\) neighbors.

Algorithm: for each user-work pair (over all \(P\) pairs):

  • find the closest raters of the user who rated this work (takes \(O(M \log M)\));

  • compute their average rating (takes \(O(K)\)).

Complexity: \(O(P (M \log M + K))\), which is far too slow.

fit(X, y, whole_dataset=False)[source]
get_shortname()[source]
property is_serializable
predict(X)[source]
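
A toy numpy sketch of the per-pair strategy described above (an illustration of the complexity argument, not the class code; the user-user similarity matrix sim is assumed to be precomputed):

    import numpy as np

    def predict_pair(user, work, ratings, sim, k=20):
        """Average the ratings of the k raters of `work` most similar to `user`."""
        raters = np.flatnonzero(ratings[:, work])
        raters = raters[raters != user]
        if raters.size == 0:
            return ratings[ratings != 0].mean()               # crude fallback
        closest = raters[np.argsort(-sim[user, raters])][:k]  # the costly sort step
        return ratings[closest, work].mean()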

zero.lasso module

class zero.lasso.MangakiLASSO(with_bias=True, alpha=0.01, T=None, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

compute_user_sparsities()[source]
fit(X, y, autoload_tags=True)[source]
get_shortname()[source]
predict(X)[source]
zero.lasso.relu(x)[source]

zero.lasso0 module

class zero.lasso0.MangakiLASSO(nb_users, nb_works, T)[source]

Bases: object

fit(X, y)[source]
predict(X)[source]

zero.nmf module

zero.pca module

class zero.pca.MangakiPCA(NB_COMPONENTS=10, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y)[source]
get_shortname()[source]
make_matrix(X, y)[source]
predict(X)[source]

zero.recommendation_algorithm module

class zero.recommendation_algorithm.RecommendationAlgorithm(metrics=None, verbose_level=1)[source]

Bases: object

static available_evaluation_metrics()[source]
compute_dcg(y_pred, y_true)[source]

Computes the discounted cumulative gain as stated in: https://gist.github.com/bwhite/3726239

static compute_mae(y_pred, y_true)[source]
compute_metrics()[source]
compute_ndcg(y_pred, y_true)[source]
static compute_rmse(y_pred, y_true)[source]
dcg_at_k(r, k)[source]
delete_snapshot()[source]
factory = <zero.recommendation_algorithm.RecommendationAlgorithmFactory object>
get_backup_path(folder, filename)[source]
get_ranked_gains(y_pred, y_true)[source]
get_shortname()[source]
classmethod instantiate_algorithm(name)[source]
property is_serializable
classmethod list_available_algorithms()[source]
load(folder, filename=None)[source]

This function raises FileNotFoundError if no backup exists.

load_tags(T=None, perform_scaling=True, with_mean=False)[source]
ndcg_at_k(r, k)[source]
recommend(user_ids, extra_users_parameters=None, item_ids=None, k=None, method='mean')[source]

Recommend \(k\) items to a group of users.

Parameters:
  • user_ids – the users that are in the dataset of this algorithm.

  • extra_users_parameters – the parameters for users that are not in the dataset.

  • item_ids – a subset of items. If it is None, all items are considered.

  • k – the number of items to recommend; if it is None, all items are returned.

  • method – a way to combine the predictions. By default it is mean.

Returns:

a numpy array with two columns, item_id and recommendation score

Complexity:

\(O(N + K \log K)\)

classmethod register_algorithm(name, klass, default_kwargs=None)[source]
save(folder, filename=None)[source]
set_parameters(nb_users, nb_works)[source]
class zero.recommendation_algorithm.RecommendationAlgorithmFactory[source]

Bases: object

initialize()[source]
register(name, klass, default_kwargs)[source]
zero.recommendation_algorithm.register_algorithm(algorithm_name, default_kwargs=None)[source]
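
A hedged usage sketch of the registry and recommendation API above (the registry name 'als' and the exact call order are assumptions, not guaranteed by these docs):

    import numpy as np
    from zero.recommendation_algorithm import RecommendationAlgorithm

    print(RecommendationAlgorithm.list_available_algorithms())   # names added via register_algorithm

    algo = RecommendationAlgorithm.instantiate_algorithm('als')  # assumed registry name
    algo.set_parameters(nb_users=3, nb_works=4)
    X = np.array([[0, 0], [0, 1], [1, 1], [2, 3]])               # (user_id, work_id) pairs
    y = np.array([5.0, 3.0, 4.0, 1.0])
    algo.fit(X, y)
    top = algo.recommend(user_ids=[0, 1], k=2, method='mean')    # two columns: item_id, score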

zero.sgd module

class zero.sgd.MangakiSGD(nb_components=20, nb_iterations=40, gamma=0.01, lambda_=0.1, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y)[source]
fit_single_user(rated_works, ratings)[source]
get_shortname()[source]
property is_serializable
predict(X)[source]
predict_one(i, j)[source]
predict_single_user(work_ids, user_parameters)[source]

zero.sgd0 module

class zero.sgd0.MangakiSGD(nb_users, nb_works, nb_components=20, nb_iterations=10, gamma=0.01, lambda_=0.1)[source]

Bases: object

fit(X, y)[source]
predict(X)[source]
predict_one(i, j)[source]

zero.sgd2 module

class zero.sgd2.MangakiSGD2(nb_components=20, nb_iterations=10, gamma=0.01, lambda_=0.1, batches=400, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y)[source]
fit_single_user(rated_works, ratings)[source]
get_shortname()[source]
property is_serializable
predict(X)[source]
predict_fm(X)[source]
predict_single_user(work_ids, user_parameters)[source]
zero.sgd2.onehotize(col, depth)[source]

zero.side module

class zero.side.SideInformation(T=None, perform_scaling=True, with_mean=False)[source]

Bases: object

load()[source]

zero.ssvd module

zero.svd module

Mangaki sparse SVD. Author: Jill-Jênn Vie, 2020

class zero.svd.MangakiSVD(nb_components=20, nb_iterations=None, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Implementation of SVD with sparse matrices. It does not compute the whole matrix for recommendations, but the production environment must be able to perform sparse matrix operations efficiently. It is 7x faster than svd1.py, and it relies only on numpy/scipy.

fit(X, y)[source]

Fit the SVD.

fit_single_user(rated_works, ratings)[source]

Fit the SVD for a single user.

get_shortname()[source]

Short name useful for logging output.

property is_serializable

Check whether we can save the model.

make_matrix(X, y)[source]

Make a sparse matrix out of X and y. X is a matrix of (user_id, item_id) pairs; y holds the corresponding real-valued ratings.

predict(X)[source]

Predict ratings for user, item pairs.

predict_single_user(work_ids, user_parameters)[source]

Predict ratings for a single user.

zero.svd.remove_mean(sp_matrix, axis=1)[source]

For each row (or column, if axis is 0) of a sparse matrix, subtract the mean of its nonzero elements from those nonzero elements.
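
A rough illustration of the operation remove_mean describes, re-implemented with scipy on a toy matrix (this is not the library code; only the behaviour stated in the docstring is assumed):

    import numpy as np
    from scipy.sparse import csr_matrix

    ratings = csr_matrix(np.array([
        [4.0, 0.0, 2.0],
        [0.0, 5.0, 3.0],
    ]))

    centered = ratings.copy().tolil()
    for i in range(ratings.shape[0]):
        row = ratings.getrow(i)
        if row.nnz:                          # mean over the nonzero ratings only
            mean_i = row.sum() / row.nnz
            centered[i, row.indices] = row.data - mean_i
    # centered now holds [[1, 0, -1], [0, 1, -1]] on the original nonzero pattern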

zero.svd2 module

Mangaki sparse SVD. Author: Jill-Jênn Vie, 2020

class zero.svd2.MangakiSVD2(nb_components=20, nb_iterations=None, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Implementation of SVD with sparse matrices. It does not compute the whole matrix for recommendations, but the production environment should be able to perform sparse matrix operations efficiently. It is 7x faster than svd.py, and it relies only on numpy/scipy.

fit(X, y)[source]

Fit the SVD.

fit_single_user(rated_works, ratings)[source]

Fit the SVD for a single user.

get_shortname()[source]

Short name useful for logging output.

property is_serializable

Check whether we can save the model.

make_matrix(X, y)[source]

Make a sparse matrix out of X and y. X is a matrix of (user_id, item_id) pairs; y holds the corresponding real-valued ratings.

predict(X)[source]

Predict ratings for user, item pairs.

predict_single_user(work_ids, user_parameters)[source]

Predict ratings for a single user.

zero.svd2.remove_mean(sp_matrix)[source]

For each row of a sparse matrix, remove the mean of nonzero elements from the nonzero elements.

zero.values module

zero.wals module

zero.xals module

class zero.xals.MangakiXALS(nb_components=10, nb_iterations=10, lambda_=0.1, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

factorize(matrix, random_state)[source]
fit(X, y)[source]
fit_user(user, matrix)[source]
fit_work(work, matrixT)[source]
get_shortname()[source]
property is_serializable
make_matrix(X, y)[source]
predict(X)[source]

zero.zero module

class zero.zero.MangakiZero(*args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y)[source]
get_shortname()[source]
predict(X)[source]

Module contents