BigMUD 2013: The First International Workshop on Mining and Understanding from Big Data


In conjunction with

IEEE International Conference on Data Mining (ICDM 2013)

Dallas, Texas.  December 8-11, 2013

(Distinguished papers presented at the workshop, after further extension and revision, will appear in a Special Issue of the SCI-indexed Journal of Computer Science and Technology (JCST), Springer.)


Big data refers to datasets that exceed the capability of commonly used IT systems in terms of processing space and/or time. Traditionally, massive data were produced mostly in scientific fields such as astronomy, meteorology, genomics, physics, biology, and environmental research. Owing to the rapid development of information technology and the consequent decrease in the cost of collecting and storing data, big data is now generated in almost every industry and sector, as well as in government departments, including retail, finance, banking, security, audit, electric power, and healthcare, to name a few. Recently, big data over the Web (big Web data for short), which includes all context data such as user-generated content, browser/search log data, and deep web data, has attracted extensive interest, because these context data and their analyses help us understand what is happening in real life. This in turn enables new ways of improving the user experience through more accurate predictions and recommendations, creating a smarter, personalized Internet.

Currently, big data is often on the order of petabytes or even exabytes. However, big data is increasing not only in size, but also in growth rate and variety. Its volume often grows exponentially, or even at rates that outpace the well-known Moore's Law. Meanwhile, big data has extended from traditional structured data to semi-structured and completely unstructured data of different types, such as text, images, audio, video, click streams, and log files. Moreover, big data is often internally interconnected and thus forms complex data/information networks.

Although big data offers unprecedented opportunities, it also poses many grand challenges. Because of its massive volume and inherent complexity, it is extremely difficult to store, aggregate, manage, and analyze big data, and ultimately to mine valuable information/knowledge from the complex data/information networks. As a result, the models, algorithms, and methods of traditional data mining are no longer effective or efficient in the presence of big data. For instance, similarity learning, on which various similarity-based tasks (e.g., ranking and clustering) are built, is extremely challenging in real applications with big data, because such data is typically heterogeneous, time-evolving, sparse, and noisy. In addition, some data is generated at exponential or super-exponential rates in a streaming manner. How to carry out real-time analysis, deep mining, and understanding of big data so as to obtain dynamic and incremental information/knowledge is therefore another grand challenge. In general, the era of big data calls for new models, algorithms, methods, and even paradigms for mining, analyzing, and understanding big data.

This workshop aims to provide a networking venue that will bring together scientists, researchers, professionals, and practitioners from both industry and academia and from different disciplines (including computer science, social science, network science, etc.) to exchange ideas, discuss solutions, share experiences, promote collaborations, and report state-of-the-art research results and technological innovations on various aspects of mining and understanding from big data.

Scope and Topics

The topics of interest include, but are not limited to:

Important Dates

Submission Deadline: August 17, 2013, 11:59:59 PM
Author Notification: September 24, 2013
Workshop Date: December 8, 2013


8:30 - 10:30  Session 1: Welcome, Invited Talk
8:30 - 8:35  Chairs' Welcome
8:35 - 9:15  Jie Tang: SAE: Social Analytic Engine for Large-scale Networks (Invited Talk)
9:15 - 10:00  Session 2: Papers
9:15 - 9:35  Gavin Smith, James Goulding, and Duncan Barrack: Towards optimal symbolization for time series comparisons
9:35 - 9:55  Paolo Boldi and Sebastiano Vigna: In-Core Computation of Geometric Centralities with HyperBall: A Hundred Billion Nodes and Beyond
10:00 - 10:30  Coffee break
10:30 - 12:30  Session 3: Papers
10:30 - 10:50  Xianmang He: A Study on Privacy Preservation for Multi-user and Multi-granularity
10:50 - 11:20  Ali Jalali, Santanu Kolay, Peter Foldes, and Ali Dasdan: Scalable Audience Reach Estimation in Real-time Online Advertising
11:20 - 11:40  Guangchao Yao, Yao Zheng, Limin Xiao, Li Ruan, Yongnan Li, and Zhenzhong Zhang: GPU-Accelerated Query by Humming Using Modified SPRING Algorithm
11:40 - 11:55  Discussion
11:55 - 12:00  Closing remarks

Paper Submission Guideline

All papers must be submitted electronically through the conference website in PDF format. The material presented in a paper must not have been published, or be under submission, elsewhere. Each paper is limited to 8 pages, including figures and references, and must follow the IEEE ICDM format requirements.

Once accepted, the paper will be included in the conference proceedings published by IEEE Computer Society Press (indexed by EI). At least one author of each accepted paper is requested to register for the workshop.

Distinguished papers presented at the workshop, after further extension and revision, will appear in a Special Issue of the SCI-indexed Journal, Journal of Computer Science and Technology (JCST), Springer.

Workshop Co-Chairs

Xueqi Cheng, Institute of Computing Technology, CAS, China
Alvin Chin, Nokia, China
Charles X. Ling, Western University, Canada, cling@
Fei Wang, IBM T.J. Watson Research Center, USA

Organizing Committee

Enhong Chen, University of Science and Technology of China
Guanling Chen, University of Massachusetts Lowell, USA
Peng Cui, Tsinghua University, China
Irwin King, Chinese University of Hong Kong, China
Jilei Tian, Nokia, China
Jun Wang, IBM T.J. Watson Research Center

Program Committee

(Some PC members are still pending)

Lada Adamic, University of Michigan, USA
Joyce Ho, University of Texas at Austin, USA
Yoshiharu Ishikawa, Nagoya University, Japan
Xiaolong Jin, Chinese Academy of Sciences, China
Daeyoung Kim, KAIST, Korea
Jingu Kim, Nokia, USA
Ronny Lempel, Yahoo!, Israel
Jure Leskovec, Stanford University, USA
Tie-Yan Liu, Microsoft Research Asia, China
Qiaozhu Mei, University of Michigan, USA
Luo Si, Purdue University, USA
Jie Tang, Tsinghua University, China
Dacheng Tao, University of Technology Sydney, Australia
Xiang Wang, IBM T.J. Watson Research Center, USA
Rong Yan, Facebook, USA
Zhiwen Yu, Northwestern Polytechnical University, China
Dan Zhang, Facebook, USA
Jianwen Zhang, Microsoft Research, USA
Vincent Zheng, Advanced Digital Sciences Center, Singapore
Jiayu Zhou, Arizona State University, USA

Contact Us

Should you have any query, please do not hesitate to contact us at