Data Clustering: Algorithms and Applications

Edited by Charu C. Aggarwal, Chandan K. Reddy

August 21, 2013 by Chapman and Hall/CRC
Reference – 652 Pages – 102 B/W Illustrations
ISBN 9781466558212 – CAT# K15510
Series: Chapman & Hall/CRC Data Mining and Knowledge Discovery Series



- Presents core methods for data clustering, including probabilistic, density- and grid-based, and spectral clustering
- Explores various problems and scenarios pertaining to multimedia, text, biological, categorical, network, stream, and uncertain data
- Offers in-depth insight into the clustering process, including different ways to cluster the same data set
- Includes an extensive bibliography at the end of each chapter

Research on the problem of clustering tends to be fragmented across the pattern recognition, database, data mining, and machine learning communities. Addressing this problem in a unified way, Data Clustering: Algorithms and Applications provides complete coverage of the entire area of clustering, from basic methods to more refined and complex data clustering approaches. It pays special attention to recent issues in graphs, social networks, and other domains.

The book focuses on three primary aspects of data clustering:

Methods, describing key techniques commonly used for clustering, such as feature selection, agglomerative clustering, partitional clustering, density-based clustering, probabilistic clustering, grid-based clustering, spectral clustering, and nonnegative matrix factorization

Domains, covering methods used for different domains of data, such as categorical data, text data, multimedia data, graph data, biological data, stream data, uncertain data, time series clustering, high-dimensional clustering, and big data

Variations and Insights, discussing important variations of the clustering process, such as semisupervised clustering, interactive clustering, multiview clustering, cluster ensembles, and cluster validation

In this book, top researchers from around the world explore the characteristics of clustering problems in a variety of application areas. They also explain how to glean detailed insight from the clustering process—including how to verify the quality of the underlying clusters—through supervision, human intervention, or the automated generation of alternative clusters.

Table of Contents

An Introduction to Cluster Analysis (Charu C. Aggarwal)

Feature Selection for Clustering: A Review (Salem Alelyani, Jiliang Tang, and Huan Liu)

Probabilistic Models for Clustering (Hongbo Deng and Jiawei Han)

A Survey of Partitional and Hierarchical Clustering Algorithms (Chandan K. Reddy and Bhanukiran Vinzamuri)

Density-Based Clustering (Martin Ester)

Grid-Based Clustering (Wei Cheng, Wei Wang, and Sandra Batista)

Non-Negative Matrix Factorizations for Clustering: A Survey (Tao Li and Chris Ding)

Spectral Clustering (Jialu Liu and Jiawei Han)

Clustering High-Dimensional Data (Arthur Zimek)

A Survey of Stream Clustering Algorithms (Charu C. Aggarwal)

Big Data Clustering (Hanghang Tong and U. Kang)

Clustering Categorical Data (Bill Andreopoulos)

Document Clustering: The Next Frontier (David C. Anastasiu, Andrea Tagarelli, and George Karypis)

Clustering Multimedia Data (Shen-Fu Tsai, Guo-Jun Qi, Shiyu Chang, Min-Hsuan Tsai, and Thomas S. Huang)

Time Series Data Clustering (Dimitrios Kotsakos, Goce Trajcevski, Dimitrios Gunopulos, and Charu C. Aggarwal)

Clustering Biological Data (Chandan K. Reddy, Mohammad Al Hasan, and Mohammed J. Zaki)

Network Clustering (Srinivasan Parthasarathy and S.M. Faisal)

A Survey of Uncertain Data Clustering Algorithms (Charu C. Aggarwal)

Concepts of Visual and Interactive Clustering (Alexander Hinneburg)

Semi-Supervised Clustering (Amrudin Agovic and Arindam Banerjee)

Alternative Clustering Analysis: A Review (James Bailey)

Cluster Ensembles: Theory and Applications (Joydeep Ghosh and Ayan Acharya)

Clustering Validation Measures (Hui Xiong and Zhongmou Li)

Educational and Software Resources for Data Clustering (Charu C. Aggarwal and Chandan K. Reddy)


Editor Bios

Charu C. Aggarwal is a Research Scientist at the IBM T. J. Watson Research Center in Yorktown Heights, New York. He received his B.S. from IIT Kanpur in 1993 and his Ph.D. from Massachusetts Institute of Technology in 1996. His research interest during his Ph.D. years was in combinatorial optimization (network flow algorithms), and his thesis advisor was Professor James B. Orlin. He has since worked in the fields of performance analysis, databases, and data mining. He has published over 200 papers in refereed conferences and journals, and has applied for or been granted over 80 patents. He is author or editor of nine books, including this one. Because of the commercial value of the above-mentioned patents, he has received several invention achievement awards and has thrice been designated a Master Inventor at IBM. He is a recipient of an IBM Corporate Award (2003) for his work on bio-terrorist threat detection in data streams, a recipient of the IBM Outstanding Innovation Award (2008) for his scientific contributions to privacy technology, and a recipient of an IBM Research Division Award (2008) for his scientific contributions to data stream research. He has served on the program committees of most major database/data mining conferences, and served as program vice-chair of the SIAM Conference on Data Mining, 2007, the IEEE ICDM Conference, 2007, the WWW Conference, 2009, and the IEEE ICDM Conference, 2009. He served as an associate editor of the IEEE Transactions on Knowledge and Data Engineering Journal from 2004 to 2008. He is an associate editor of the ACM TKDD Journal, an action editor of the Data Mining and Knowledge Discovery Journal, an associate editor of the ACM SIGKDD Explorations, and an associate editor of the Knowledge and Information Systems Journal. He is a fellow of the IEEE for “contributions to knowledge discovery and data mining techniques,” and a life member of the ACM.

Chandan K. Reddy is an Assistant Professor in the Department of Computer Science at Wayne State University. He received his Ph.D. from Cornell University and his M.S. from Michigan State University. His primary research interests are in the areas of data mining and machine learning with applications to healthcare, bioinformatics, and social network analysis. His research is funded by the National Science Foundation, the Department of Transportation, and the Susan G. Komen for the Cure Foundation. He has published over 40 peer-reviewed articles in leading conferences and journals. He received the Best Application Paper Award at the ACM SIGKDD conference in 2010 and was a finalist in the INFORMS Franz Edelman Award Competition in 2011. He is a member of IEEE, ACM, and SIAM.
