Open Access System for Information Sharing

Article
Cited 29 times in Web of Science; cited 31 times in Scopus
Full metadata record
Files in This Item:
There are no files associated with this item.
DC Field | Value | Language
dc.contributor.author | Jinoh Oh | -
dc.contributor.author | Han, W.-S | -
dc.contributor.author | Hwangjo Yu | -
dc.contributor.author | Xiaoqian Jiang | -
dc.date.accessioned | 2017-07-19T13:36:43Z | -
dc.date.available | 2017-07-19T13:36:43Z | -
dc.date.created | 2017-02-13 | -
dc.date.issued | 2015-08 | -
dc.identifier.issn | 0730-8078 | -
dc.identifier.uri | https://oasis.postech.ac.kr/handle/2014.oak/37322 | -
dc.description.abstract | Matrix factorization is one of the fundamental techniques for analyzing the latent relationships between two entities. It is used in particular for recommendation because of its high accuracy. Efficient parallel SGD matrix factorization algorithms have been developed for large matrices to speed up the convergence of factorization. However, most of them are designed for a shared-memory environment and thus fail to factorize a matrix that is too big to fit in memory, and their performance is also unreliable when the matrix is skewed. This paper proposes a fast and robust parallel SGD matrix factorization algorithm, called MLGF-MF, which is robust to skewed matrices and runs efficiently on block-storage devices (e.g., SSDs) as well as in shared memory. MLGF-MF uses the Multi-Level Grid File (MLGF) to partition the matrix and minimizes the cost of scheduling parallel SGD updates on the partitioned regions by exploiting partial match query processing. As a result, MLGF-MF produces reliable results efficiently even on skewed matrices. MLGF-MF is designed with asynchronous I/O built into the algorithm so that the CPU keeps executing without waiting for I/O to complete. MLGF-MF thereby overlaps CPU and I/O processing, which offsets the I/O cost and maximizes CPU utilization. Recent flash SSDs support high-performance parallel I/O and are therefore well suited to asynchronous I/O. In our extensive evaluations, MLGF-MF significantly outperforms (that is, converges faster than) the state-of-the-art algorithms in both shared-memory and block-storage environments. In addition, the outputs of MLGF-MF are significantly more robust to skewed matrices. Our implementation of MLGF-MF is available at http://dm.postech.ac.kr/MLGF-MF as executable files. | -
dc.language | English | -
dc.publisher | ACM | -
dc.relation.isPartOf | Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining | -
dc.title | Fast and Robust Parallel SGD Matrix Factorization | -
dc.type | Article | -
dc.identifier.doi | 10.1145/2783258.2783322 | -
dc.type.rims | ART | -
dc.identifier.bibliographicCitation | Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 865-874 | -
dc.identifier.wosid | 000485312900090 | -
dc.citation.endPage | 874 | -
dc.citation.startPage | 865 | -
dc.citation.title | Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining | -
dc.contributor.affiliatedAuthor | Han, W.-S | -
dc.identifier.scopusid | 2-s2.0-84954101095 | -
dc.description.journalClass | 1 | -
dc.description.scptc | 8 | *
dc.date.scptcdate | 2018-05-121 | *
dc.type.docType | Proceedings Paper | -
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | -
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | -
dc.relation.journalWebOfScienceCategory | Computer Science, Theory & Methods | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
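
For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of plain, sequential SGD matrix factorization. It is not MLGF-MF: it does no grid-file partitioning, no parallel scheduling, and no asynchronous I/O. The function names (`sgd_mf`, `rmse`), the hyperparameters, and the toy ratings are all assumptions made for illustration and do not come from the paper.

```python
import random


def sgd_mf(ratings, n_rows, n_cols, rank=2, lr=0.02, reg=0.01, epochs=500, seed=0):
    """Factorize sparse ratings [(row, col, value), ...] into P and Q.

    P has shape n_rows x rank and Q has shape n_cols x rank, so that
    value ~= dot(P[row], Q[col]). All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    ratings = list(ratings)
    for _ in range(epochs):
        rng.shuffle(ratings)  # visit the observed entries in random order
        for i, j, r in ratings:
            err = r - sum(P[i][k] * Q[j][k] for k in range(rank))
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                # Gradient step on the squared error with L2 regularization.
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error of the factorization on the given entries."""
    total = 0.0
    for i, j, r in ratings:
        pred = sum(p * q for p, q in zip(P[i], Q[j]))
        total += (r - pred) ** 2
    return (total / len(ratings)) ** 0.5


# Toy 4 x 4 rating matrix with 10 observed entries (made-up data).
toy = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 3, 1), (2, 1, 1),
       (2, 2, 5), (3, 2, 4), (3, 3, 5), (0, 3, 1), (2, 3, 4)]
P0, Q0 = sgd_mf(toy, 4, 4, epochs=0)  # untrained factors as a baseline
P, Q = sgd_mf(toy, 4, 4)
print("RMSE before:", rmse(toy, P0, Q0), "after:", rmse(toy, P, Q))
```

Each update touches only one row of P and one row of Q, which is why updates on matrix regions that share no rows or columns can in principle run in parallel; how MLGF-MF partitions and schedules those regions is described in the paper itself, not here.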
