Zero-Injection: Data and Code

This page provides the data and code used for the evaluations in the paper "Told You I Didn't Like It: Exploiting Uninteresting Items for Effective Collaborative Filtering," in Proc. of the 32nd IEEE Int'l Conf. on Data Engineering (IEEE ICDE), Helsinki, Finland, May 16-20, 2016.




Data


  • We use the Movielens 100K dataset and CiaoDVD dataset.

- Download the Movielens dataset: http://grouplens.org/datasets/movielens/

- Download the CiaoDVD dataset: http://www.librec.net/datasets.html/
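
For readers who want to inspect the Movielens 100K ratings before running any of the code listed on this page, the following is a minimal sketch of a loader. It assumes the tab-separated `u.data` layout of that dataset (user id, item id, rating, timestamp, with 1-based ids). The function name `parse_ratings` and the path `ml-100k/u.data` are illustrative assumptions and are not part of the released code.

```python
import numpy as np

def parse_ratings(lines):
    """Build a user-by-item rating matrix from 'user<TAB>item<TAB>rating<TAB>timestamp' lines.

    Unrated cells are np.nan. Ids are assumed to be 1-based (an assumption
    about the input file, as in Movielens 100K).
    """
    triples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user, item, rating, _timestamp = line.split("\t")
        triples.append((int(user), int(item), float(rating)))

    n_users = max(u for u, _, _ in triples)
    n_items = max(i for _, i, _ in triples)
    matrix = np.full((n_users, n_items), np.nan)
    for u, i, r in triples:
        matrix[u - 1, i - 1] = r  # shift 1-based ids to 0-based indices
    return matrix

# Hypothetical usage; the path depends on where the archive was unpacked:
# with open("ml-100k/u.data") as f:
#     ratings = parse_ratings(f)
```

Keeping unrated cells as `np.nan` rather than 0 matters here, because in this work a 0 carries meaning (an injected "uninteresting" rating) and should not be confused with a missing rating.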



Code


  • The wALS algorithm of the OCCF method is implemented in the open-source GraphChi.

- Download GraphChi: https://github.com/GraphChi


  • The item-based and SVD-based methods, SVD++, and PureSVD are implemented in the open-source MyMediaLite.
- Download MyMediaLite: http://www.mymedialite.net/

  • A core part of our approach, zero-injection, is available for download.
- Download zero-injection: https://sites.google.com/a/agape.hanyang.ac.kr/dake-laboratory/icde-2016/zero-injection.zip?attredirects=0&d=1
Note: The idea and code are free for non-commercial academic use only. If you are interested in commercial use, please contact zero-injection@agape.hanyang.ac.kr by email. We also encourage you to cite our paper below if you use our idea or code in your work.
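
As the paper's abstract describes, zero-injection imputes a rating of zero for unrated items that a user is likely to find uninteresting, and the enriched matrix is then passed to any CF method. The sketch below illustrates only that imputation step and is not the code in zero-injection.zip. It assumes that a matrix of inferred pre-use preference scores is already available (for example, from an OCCF method such as wALS). The function name `zero_inject`, the parameter `theta`, and the per-user selection of the lowest-scoring fraction of unrated items are assumptions made for illustration.

```python
import numpy as np

def zero_inject(ratings, pre_use_scores, theta=0.9):
    """Return a copy of `ratings` with zeros injected for likely-uninteresting items.

    ratings        : 2D array, np.nan where a user has not rated an item.
    pre_use_scores : 2D array of the same shape; a lower score means the user
                     is less likely to be interested (assumed precomputed).
    theta          : fraction of each user's unrated items treated as
                     uninteresting (hypothetical parameter name).
    """
    ratings = np.asarray(ratings, dtype=float)
    scores = np.asarray(pre_use_scores, dtype=float)
    augmented = ratings.copy()

    for u in range(ratings.shape[0]):
        unrated = np.flatnonzero(np.isnan(ratings[u]))
        k = int(theta * unrated.size)  # number of items to zero for this user
        if k == 0:
            continue
        # Pick the k unrated items with the lowest pre-use preference scores.
        order = np.argsort(scores[u, unrated], kind="stable")
        augmented[u, unrated[order[:k]]] = 0.0
    return augmented

# Tiny example: 2 users, 5 items, np.nan marks unrated items.
example_ratings = np.array([[5, np.nan, np.nan, np.nan, 3],
                            [np.nan, 4, np.nan, np.nan, np.nan]])
example_scores = np.array([[0.9, 0.1, 0.8, 0.2, 0.7],
                           [0.3, 0.9, 0.6, 0.1, 0.2]])
example_augmented = zero_inject(example_ratings, example_scores, theta=0.5)
```

Observed ratings are left untouched, and the unrated items that are not selected stay missing, so a downstream CF method can still predict and recommend them.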



Paper

  • Title
"Told You I Didn't Like It": Exploiting Uninteresting Items for Effective Collaborative Filtering.
  • Abstract
We study how to improve the accuracy and running time of top-N recommendation with collaborative filtering (CF). Unlike existing work that uses mostly rated items (which is only a small fraction in a rating matrix), we propose the notion of pre-use preferences of users toward a vast amount of unrated items. Using this novel notion, we effectively identify uninteresting items that were not rated yet but are likely to receive very low ratings from users, and impute them as zero. This simple-yet-novel zero-injection method applied to a set of carefully-chosen uninteresting items not only addresses the sparsity problem by enriching a rating matrix but also completely prevents uninteresting items from being recommended as top-N items, thereby improving accuracy greatly. As our proposed idea is method-agnostic, it can be easily applied to a wide variety of popular CF methods. Through comprehensive experiments using the Movielens dataset and MyMediaLite implementation, we successfully demonstrate that our solution consistently and universally improves the accuracies of popular CF methods (e.g., item-based CF, SVD-based CF, and SVD++) by two to five times on average. Furthermore, our approach reduces the running time of those CF methods by 1.2 to 2.3 times when its setting produces the best accuracy.
  • Citation
@inproceedings{hwa16,
title={"Told You I Didn't Like It": Exploiting Uninteresting Items for Effective Collaborative Filtering},
author={Hwang, Won-Seok and Parc, Juan and Kim, Sang-Wook and Lee, Jongwuk and Lee, Dongwon},
booktitle={2016 IEEE 32nd International Conference on Data Engineering},
pages={349--360},
year={2016},
organization={IEEE}
}
  • Reference

Won-Seok Hwang, Juan Parc, Sang-Wook Kim, Jongwuk Lee, and Dongwon Lee, “Told You I Didn't Like It: Exploiting Uninteresting Items for Effective Collaborative Filtering,” In Proc. of the 32nd IEEE Int'l Conf. on Data Engineering (IEEE ICDE), Helsinki, Finland, May 16-20, 2016.

