zero package

Subpackages

Submodules

zero.als module

class zero.als.MangakiALS(nb_components=20, nb_iterations=40, lambda_=0.1, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Alternating Least Squares: \(r_{ij} - mean_i = u_i^T v_j\). Ratings are preprocessed by removing the mean rating of each user. Then \(u_i\) and \(v_j\) are updated alternately, using the least-squares estimator (closed form).

ALS: Zhou, Yunhong, et al. “Large-scale parallel collaborative filtering for the netflix prize.” International Conference on Algorithmic Applications in Management. Springer, Berlin, Heidelberg, 2008. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.173.2797&rep=rep1&type=pdf

Implemented by Pierre Vigier, JJ Vie

factorize(matrix, random_state)[source]
fit(X, y)[source]
fit_single_user(rated_works, ratings)[source]
fit_user(user, matrix)[source]
fit_work(work, matrixT)[source]
get_shortname()[source]
property is_serializable
make_matrix(X, y)[source]
predict(X)[source]
predict_single_user(work_ids, user_parameters)[source]
unzip()[source]
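
A minimal usage sketch (not taken from these docs; the shape of X and y follows the make_matrix convention documented for zero.svd below, and calling set_parameters before fit is an assumption):

    import numpy as np
    from zero.als import MangakiALS

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 2]])  # (user_id, work_id) pairs
    y = np.array([4.0, 2.0, 5.0, 3.0])              # corresponding ratings

    als = MangakiALS(nb_components=20, nb_iterations=40, lambda_=0.1)
    als.set_parameters(nb_users=2, nb_works=3)      # assumed: declare dataset size before fitting
    als.fit(X, y)
    print(als.predict(np.array([[1, 1]])))          # predicted rating of user 1 for work 1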

zero.als2 module

class zero.als2.MangakiALS2(nb_components=20, nb_iterations=40, lambda_=0.1, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Alternating Least Squares for the “Singular Value Decomposition” model (aka latent factor model): \(r_{ij} - mean = bias_i + bias_j + u_i^T v_j\). This is a modified version of ALS for the SVD model. Ratings are preprocessed by removing the overall mean. Then (\(u_i\), \(bias_i\)) and (\(v_j\), \(bias_j\)) are updated alternately in closed form.

ALS: Zhou, Yunhong, et al. “Large-scale parallel collaborative filtering for the netflix prize.” International Conference on Algorithmic Applications in Management. Springer, Berlin, Heidelberg, 2008. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.173.2797&rep=rep1&type=pdf

SVD: Koren, Yehuda, and Robert Bell. “Advances in collaborative filtering.” Recommender systems handbook. Springer, Boston, MA, 2015. 77-118. https://pdfs.semanticscholar.org/6800/fbe3314be9f638fb075e15b489d1aadb3030.pdf

factorize(matrix, random_state)[source]
fit(X, y)[source]
fit_user(user, matrix)[source]
fit_work(work, matrixT)[source]
get_shortname()[source]
property is_serializable
make_matrix(X, y)[source]
predict(X)[source]
unzip()[source]
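
A hedged sketch of the closed-form user update implied by the model above (the notation is derived from the stated model rather than read from the code; regularizing the bias with the same \(\lambda\) is an assumption). Folding \(bias_i\) into the user factor by appending a constant 1 to each item factor, \(\tilde v_j = (v_j, 1)\), the least-squares update over the set \(I_i\) of works rated by user \(i\) is

\[(u_i, bias_i) = \Big(\sum_{j \in I_i} \tilde v_j \tilde v_j^T + \lambda I\Big)^{-1} \sum_{j \in I_i} (r_{ij} - mean - bias_j)\, \tilde v_j\]

and the update for \((v_j, bias_j)\) is symmetric, with users and works exchanged.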

zero.als3 module

class zero.als3.MangakiALS3(nb_components=20, nb_iterations=20, lambda_=0.1, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Alternating Least Squares for the “Singular Value Decomposition” model (aka latent factor model).

This implementation is intended to be shorter and easier to read than MangakiALS2, but its performance is slightly worse, possibly because the initialization differs (Gaussian instead of uniform, even though Zhou’s paper suggests a Gaussian initialization).

fit(X, y)[source]
fit_user(user_id)[source]
fit_work(work_id)[source]
get_shortname()[source]
init_vars()[source]
property is_serializable
predict(X)[source]
to_dict(X, y)[source]
to_sparse(X, y)[source]

zero.balse module

class zero.balse.MangakiBALSE(nb_components=10, nb_iterations=10, lambda_=0.1, alpha=0.01, with_bias=True, gamma=5, T=None, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

M = None
U = None
VT = None
fit(X, y)[source]
get_shortname()[source]
predict(X)[source]

zero.cfm module

zero.chrono module

class zero.chrono.Chrono(is_enabled)[source]

Bases: object

checkpoint = None
connection = None
is_enabled = True
save(title)[source]

zero.dataset module

class zero.dataset.AnonymizedData(X, y, y_text, nb_users, nb_works)

Bases: tuple

X

Alias for field number 0

nb_users

Alias for field number 3

nb_works

Alias for field number 4

y

Alias for field number 1

y_text

Alias for field number 2

class zero.dataset.Dataset[source]

Bases: object

decode_users(encoded_user_ids)[source]
encode_works(work_ids)[source]
load_csv(filename, convert=<class 'float'>, title_filename=None)[source]
make_anonymous_data(triplets, convert=<function Dataset.<lambda>>, ordered=False)[source]
save_csv(folder, suffix='')[source]
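
A small sketch of the AnonymizedData container documented above (the values are made up; only the field order follows the “Alias for field number” entries):

    import numpy as np
    from zero.dataset import AnonymizedData

    data = AnonymizedData(
        X=np.array([[0, 0], [1, 2]]),   # field 0: encoded (user_id, work_id) pairs
        y=np.array([4.0, 2.5]),         # field 1: numeric ratings
        y_text=['like', 'neutral'],     # field 2: original textual ratings
        nb_users=2,                     # field 3
        nb_works=3,                     # field 4
    )
    assert data.X is data[0] and data.nb_works == data[4]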

zero.efa module

class zero.efa.MangakiEFA(NB_COMPONENTS=20, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Factor Analysis: see http://scikit-learn.org/stable/modules/decomposition.html#factor-analysis. For a better way to get interpretable components, see MangakiNMF.

fit(X, y, truncated=None)[source]
fit_user(user_id, sparse_matrix_dict)[source]
get_shortname()[source]
make_matrix(X, y)[source]
predict(X)[source]
set_parameters(nb_users, nb_works)[source]
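
For reference, a minimal scikit-learn Factor Analysis call of the kind the docstring points to (plain scikit-learn, not the MangakiEFA wrapper; the dense user-by-work matrix is a toy example):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    R = np.random.default_rng(0).normal(size=(100, 30))  # dense user x work matrix
    fa = FactorAnalysis(n_components=20)
    U = fa.fit_transform(R)   # user factors, shape (100, 20)
    V = fa.components_        # work loadings, shape (20, 30)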

zero.fma module

zero.gbr module

class zero.gbr.MangakiGBR(nb_components=20, nb_estimators=2, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y)[source]
get_shortname()[source]
predict(X)[source]
prepare_features(X, U, V)[source]

zero.knn module

class zero.knn.MangakiKNN(nb_neighbors=20, rated_by_neighbors_at_least=3, missing_is_mean=True, weighted_neighbors=False, nb_iterations=None, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y, whole_dataset=False)[source]
fit_single_user(rated_works, ratings)[source]
get_neighbors(user_ids=None)[source]
get_shortname()[source]
property is_serializable
predict(X)[source]
predict_single_user(work_ids, neighbor_ids)[source]
zero.knn.cosine_similarity(X, Y=None)[source]
zero.knn.mean_of_nonzero(X, cols)[source]
zero.knn.normalize(X)[source]

zero.knn0 module

class zero.knn0.MangakiKNN(nb_users, nb_works, nb_neighbors=20)[source]

Bases: object

fit(X, y)[source]
predict(X)[source]

zero.knn2 module

class zero.knn2.MangakiKNN2(nb_neighbors=20, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Toy implementation (not usable in production) of KNN for the mere sake of science. \(N\) users, \(M\) ~ 10k works, \(P\) ~ 300k user-work pairs, \(K\) neighbors.

Algorithm: for each user-work pair (over all \(P\) pairs):

  • find the closest raters of the user who rated this work (takes \(O(M \log M)\));

  • compute their average rating (takes \(O(K)\)).

Complexity: \(O(P (M \log M + K))\), which is far too slow.

fit(X, y, whole_dataset=False)[source]
get_shortname()[source]
property is_serializable
predict(X)[source]
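
A toy numpy sketch of the per-pair strategy described above (an illustration of the complexity argument, not the class code; the user-user similarity matrix sim is assumed to be precomputed):

    import numpy as np

    def predict_pair(user, work, ratings, sim, k=20):
        """Average the ratings of the k raters of `work` most similar to `user`."""
        raters = np.flatnonzero(ratings[:, work])
        raters = raters[raters != user]
        if raters.size == 0:
            return ratings[ratings != 0].mean()               # crude fallback
        closest = raters[np.argsort(-sim[user, raters])][:k]  # the costly sort step
        return ratings[closest, work].mean()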

zero.lasso module

class zero.lasso.MangakiLASSO(with_bias=True, alpha=0.01, T=None, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

compute_user_sparsities()[source]
fit(X, y, autoload_tags=True)[source]
get_shortname()[source]
predict(X)[source]
zero.lasso.relu(x)[source]

zero.lasso0 module

class zero.lasso0.MangakiLASSO(nb_users, nb_works, T)[source]

Bases: object

fit(X, y)[source]
predict(X)[source]

zero.nmf module

zero.pca module

class zero.pca.MangakiPCA(NB_COMPONENTS=10, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y)[source]
get_shortname()[source]
make_matrix(X, y)[source]
predict(X)[source]

zero.recommendation_algorithm module

class zero.recommendation_algorithm.RecommendationAlgorithm(metrics=None, verbose_level=1)[source]

Bases: object

static available_evaluation_metrics()[source]
compute_dcg(y_pred, y_true)[source]

Computes the discounted cumulative gain as stated in: https://gist.github.com/bwhite/3726239

static compute_mae(y_pred, y_true)[source]
compute_metrics()[source]
compute_ndcg(y_pred, y_true)[source]
static compute_rmse(y_pred, y_true)[source]
dcg_at_k(r, k)[source]
delete_snapshot()[source]
factory = <zero.recommendation_algorithm.RecommendationAlgorithmFactory object>
get_backup_path(folder, filename)[source]
get_ranked_gains(y_pred, y_true)[source]
get_shortname()[source]
classmethod instantiate_algorithm(name)[source]
property is_serializable
classmethod list_available_algorithms()[source]
load(folder, filename=None)[source]

This function raises FileNotFoundError if no backup exists.

load_tags(T=None, perform_scaling=True, with_mean=False)[source]
ndcg_at_k(r, k)[source]
recommend(user_ids, extra_users_parameters=None, item_ids=None, k=None, method='mean')[source]

Recommend \(k\) items to a group of users.

Parameters:
  • user_ids – the users that are in the dataset of this algorithm.

  • extra_users_parameters – the parameters for users that are not in the dataset.

  • item_ids – a subset of items. If it is None, all items are considered.

  • k – the number of items to recommend; if it is None, all items are returned.

  • method – a way to combine the predictions. By default it is mean.

Returns:

a numpy array with two columns, item_id and recommendation score

Complexity:

\(O(N + K \log K)\)

classmethod register_algorithm(name, klass, default_kwargs=None)[source]
save(folder, filename=None)[source]
set_parameters(nb_users, nb_works)[source]
class zero.recommendation_algorithm.RecommendationAlgorithmFactory[source]

Bases: object

initialize()[source]
register(name, klass, default_kwargs)[source]
zero.recommendation_algorithm.register_algorithm(algorithm_name, default_kwargs=None)[source]
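
A hedged usage sketch of the registry and recommendation API above (the registry name 'als' and the exact call order are assumptions, not guaranteed by these docs):

    import numpy as np
    from zero.recommendation_algorithm import RecommendationAlgorithm

    print(RecommendationAlgorithm.list_available_algorithms())   # names added via register_algorithm

    algo = RecommendationAlgorithm.instantiate_algorithm('als')  # assumed registry name
    algo.set_parameters(nb_users=3, nb_works=4)
    X = np.array([[0, 0], [0, 1], [1, 1], [2, 3]])               # (user_id, work_id) pairs
    y = np.array([5.0, 3.0, 4.0, 1.0])
    algo.fit(X, y)
    top = algo.recommend(user_ids=[0, 1], k=2, method='mean')    # two columns: item_id, score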

zero.sgd module

class zero.sgd.MangakiSGD(nb_components=20, nb_iterations=40, gamma=0.01, lambda_=0.1, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y)[source]
fit_single_user(rated_works, ratings)[source]
get_shortname()[source]
property is_serializable
predict(X)[source]
predict_one(i, j)[source]
predict_single_user(work_ids, user_parameters)[source]

zero.sgd0 module

class zero.sgd0.MangakiSGD(nb_users, nb_works, nb_components=20, nb_iterations=10, gamma=0.01, lambda_=0.1)[source]

Bases: object

fit(X, y)[source]
predict(X)[source]
predict_one(i, j)[source]

zero.sgd2 module

class zero.sgd2.MangakiSGD2(nb_components=20, nb_iterations=10, gamma=0.01, lambda_=0.1, batches=400, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y)[source]
fit_single_user(rated_works, ratings)[source]
get_shortname()[source]
property is_serializable
predict(X)[source]
predict_fm(X)[source]
predict_single_user(work_ids, user_parameters)[source]
zero.sgd2.onehotize(col, depth)[source]

zero.side module

class zero.side.SideInformation(T=None, perform_scaling=True, with_mean=False)[source]

Bases: object

load()[source]

zero.ssvd module

zero.svd module

Mangaki sparse SVD. Author: Jill-Jênn Vie, 2020

class zero.svd.MangakiSVD(nb_components=20, nb_iterations=None, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Implementation of SVD with sparse matrices. It does not compute the whole matrix for recommendations, but the production environment must be able to perform sparse matrix operations efficiently. It is 7x faster than svd1.py, and it relies only on numpy/scipy.

fit(X, y)[source]

Fit the SVD.

fit_single_user(rated_works, ratings)[source]

Fit the SVD for a single user.

get_shortname()[source]

Short name useful for logging output.

property is_serializable

Check whether we can save the model.

make_matrix(X, y)[source]

Make a sparse matrix out of X and y. X is a matrix of (user_id, item_id) pairs; y holds the corresponding real-valued ratings.

predict(X)[source]

Predict ratings for user, item pairs.

predict_single_user(work_ids, user_parameters)[source]

Predict ratings for a single user.

zero.svd.remove_mean(sp_matrix, axis=1)[source]

For each row (or column, if axis is 0) of a sparse matrix, subtract the mean of its nonzero elements from those nonzero elements.
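
A rough illustration of the operation remove_mean describes, re-implemented with scipy on a toy matrix (this is not the library code; only the behaviour stated in the docstring is assumed):

    import numpy as np
    from scipy.sparse import csr_matrix

    ratings = csr_matrix(np.array([
        [4.0, 0.0, 2.0],
        [0.0, 5.0, 3.0],
    ]))

    centered = ratings.copy().tolil()
    for i in range(ratings.shape[0]):
        row = ratings.getrow(i)
        if row.nnz:                          # mean over the nonzero ratings only
            mean_i = row.sum() / row.nnz
            centered[i, row.indices] = row.data - mean_i
    # centered now holds [[1, 0, -1], [0, 1, -1]] on the original nonzero pattern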

zero.svd2 module

Mangaki sparse SVD. Author: Jill-Jênn Vie, 2020

class zero.svd2.MangakiSVD2(nb_components=20, nb_iterations=None, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

Implementation of SVD with sparse matrices. It does not compute the whole matrix for recommendations, but the production environment should be able to perform sparse matrix operations efficiently. It is 7x faster than svd.py, and it relies only on numpy/scipy.

fit(X, y)[source]

Fit the SVD.

fit_single_user(rated_works, ratings)[source]

Fit the SVD for a single user.

get_shortname()[source]

Short name useful for logging output.

property is_serializable

Check whether we can save the model.

make_matrix(X, y)[source]

Make a sparse matrix out of X and y. X is a matrix of (user_id, item_id) pairs; y holds the corresponding real-valued ratings.

predict(X)[source]

Predict ratings for user, item pairs.

predict_single_user(work_ids, user_parameters)[source]

Predict ratings for a single user.

zero.svd2.remove_mean(sp_matrix)[source]

For each row of a sparse matrix, remove the mean of nonzero elements from the nonzero elements.

zero.values module

zero.wals module

zero.xals module

class zero.xals.MangakiXALS(nb_components=10, nb_iterations=10, lambda_=0.1, *args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

factorize(matrix, random_state)[source]
fit(X, y)[source]
fit_user(user, matrix)[source]
fit_work(work, matrixT)[source]
get_shortname()[source]
property is_serializable
make_matrix(X, y)[source]
predict(X)[source]

zero.zero module

class zero.zero.MangakiZero(*args, **kwargs)[source]

Bases: zero.recommendation_algorithm.RecommendationAlgorithm

fit(X, y)[source]
get_shortname()[source]
predict(X)[source]

Module contents