Parallelized Frequent Item Set Mining Using a Tall and Skinny Matrix
Date Issued
02-07-2016
Author(s)
Pooja Janakiram, D.
Abstract
Big data applications often consist of very large collections of small records, for example data from retail websites, movie streaming services, sensor data applications, and many others. Frequent item set mining is a common tool in these applications, used to generate recommendations that improve the user experience. It is also used to find interesting patterns in scientific databases, such as gene expression databases. One interesting way to represent such data is to transform it into tall and skinny matrices. In this paper we explore the use of tall and skinny matrices to generate frequent item sets. The proposed algorithm is implemented on a map-reduce based framework such as Apache Spark, and experiments are performed to test the scalability of the algorithm on a cloud platform.
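To make the tall and skinny representation concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a binary transaction matrix with one row per record and one column per item, and it uses the standard observation that summing each row's outer product with itself yields pairwise co-occurrence counts. The toy transactions, the helper name `gram`, and the `min_support` threshold are all hypothetical, chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions (illustrative data, not from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
]

# Column order of the matrix: one column per distinct item.
items = sorted(set().union(*transactions))

# Tall and skinny binary matrix: many rows (records), few columns (items).
matrix = [[1 if item in t else 0 for item in items] for t in transactions]


def gram(rows, width):
    """Sum of per-row outer products, i.e. the entries of A^T A.

    Each row contributes independently and the sum is associative, so the
    same computation could be split across workers in a map-reduce setting.
    """
    g = [[0] * width for _ in range(width)]
    for row in rows:
        present = [j for j, value in enumerate(row) if value]
        for a in present:
            for b in present:
                g[a][b] += 1
    return g


counts = gram(matrix, len(items))

# Assumed support threshold: a pair is frequent if it appears in >= 3 records.
min_support = 3
frequent_pairs = {
    (items[i], items[j]): counts[i][j]
    for i, j in combinations(range(len(items)), 2)
    if counts[i][j] >= min_support
}

print(frequent_pairs)
```

The diagonal of `counts` holds single-item supports and the off-diagonal entries hold pair supports. Item sets larger than pairs would need further steps that this sketch does not cover.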
Volume
0