Full Professor, Computer Science
Founder & Director, Database & Data Mining Lab
E-mail: kleung [AT] cs.umanitoba.ca
Office Phone: +1 (204) 474-8677
Office Fax: +1 (204) 474-7609
Postal: E2-445 EITC, 75 Chancellor's Circle, Winnipeg, MB, R3T 2N2
Office: E2-403 EITC
Lab: E2-520 EITC
B.Sc. (Honours), The University of British Columbia;
M.Sc., The University of British Columbia;
Ph.D., The University of British Columbia.
Data mining refers to the search for previously unknown
patterns and relationships that might be embedded in stored data.
Most existing data mining algorithms treat the mining process as an
impenetrable black box, where users are not allowed to express their focus.
As a result, these unconstrained mining algorithms can yield numerous
patterns that do not make sense
(e.g., "customers who buy diapers also buy beer") or
that are not interesting to users.
To this end, we are developing a human-centered exploratory mining
algorithm that
(i) enables human analysts/users to impose constraints to focus the search, and
(ii) avoids irrelevant and time-consuming computation.
Such an algorithm shows an excellent division of labor,
where the computer carries out the mechanical aspect of the work
(e.g., the counting and searching)
and the human performs the intelligent aspect of the work
(e.g., the abstract thinking and observation).
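The division of labor described above can be illustrated with a minimal sketch. It is not the lab's algorithm: it is a plain level-wise (Apriori-style) search, and it assumes the user's constraint is anti-monotone, meaning that once an itemset violates it, every superset does too. The function name, the toy transactions, the prices, and the budget of 10 are all hypothetical.

```python
from itertools import combinations


def constrained_frequent_itemsets(transactions, min_support, constraint):
    """Level-wise search for frequent itemsets that satisfy `constraint`.

    Assumption: `constraint` is anti-monotone, so a violating itemset
    can be discarded together with all of its supersets.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    # Level 1: the user's constraint removes items before any counting.
    level = [frozenset([i]) for i in items if constraint(frozenset([i]))]
    result = {}
    while level:
        # Mechanical part: count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        # Build the next level, pruning by constraint and by infrequent subsets.
        candidates = set()
        for a, b in combinations(frequent, 2):
            union = a | b
            if (
                len(union) == len(a) + 1
                and constraint(union)
                and all(frozenset(s) in frequent
                        for s in combinations(union, len(a)))
            ):
                candidates.add(union)
        level = list(candidates)
    return result


# Hypothetical data: the analyst only cares about baskets costing at most 10.
prices = {"bread": 2, "milk": 3, "beer": 8, "diapers": 12}
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "beer"},
    {"milk", "beer", "diapers"},
    {"bread", "milk", "diapers"},
    {"beer", "diapers"},
]


def within_budget(itemset):
    return sum(prices[i] for i in itemset) <= 10


found = constrained_frequent_itemsets(transactions, 2, within_budget)
for itemset, support in sorted(found.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

Because the constraint is checked before counting, itemsets containing "diapers" are never counted at all, which is the sense in which irrelevant computation is avoided.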
- Data mining and analysis (including data analytics, data science & business intelligence solutions)
- Big data, databases (including image databases), data management, and data warehousing
- Data visualization and visual analytics
- Health informatics and electronic health
- Web technology and services, as well as social computing & social network analysis
It is understood that data mining is supposed to be an iterative and
exploratory process. Hence, we
not only allow users to impose certain constraints on the mining process,
but also allow users
to change these constraints dynamically in the middle of the computation.
Towards the development of a practical environment of this human-centered
exploratory mining algorithm, we are developing techniques to
support dynamic mining.
To enhance the performance of our dynamic mining algorithm,
we have proposed the following novel structures:
(i) the segment support map to facilitate scalable mining, and
(ii) the optimized segment support map (OSSM) to optimize frequency counting.
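As a rough illustration of how a segment support map can help frequency counting, the sketch below assumes the structure stores, for each segment of the database, the support count of every single item. Under that assumption, the support of an itemset within a segment cannot exceed the smallest count among its items, so summing those minima over all segments gives an upper bound on the itemset's support. The function names and the toy data are hypothetical, and this is not taken from the published implementation.

```python
from collections import Counter


def build_segment_support_map(transactions, num_segments):
    """Split the transactions into contiguous segments and count items per segment."""
    size = max(1, -(-len(transactions) // num_segments))  # ceiling division
    return [
        Counter(i for t in transactions[start:start + size] for i in set(t))
        for start in range(0, len(transactions), size)
    ]


def support_upper_bound(ssm, itemset):
    """Sum, over all segments, the smallest single-item count in the itemset."""
    return sum(min(segment[i] for i in itemset) for segment in ssm)


# Hypothetical database of six transactions.
transactions = [
    {"a", "b"}, {"a"}, {"a", "c"},
    {"b", "c"}, {"b"}, {"b", "c"},
]

one_segment = build_segment_support_map(transactions, 1)
two_segments = build_segment_support_map(transactions, 2)

loose_bound = support_upper_bound(one_segment, {"a", "b"})   # 3
tight_bound = support_upper_bound(two_segments, {"a", "b"})  # 1
print(loose_bound, tight_bound)
```

With a minimum support of 2, the two-segment bound already rules out {a, b} without scanning the database, whereas the single-segment bound does not. How the segments are chosen determines how tight the bound is.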
With respect to my research interest in image databases, the motivation is
as follows.
As the number of on-line digital images has increased rapidly,
the development of efficient and effective retrieval of images is necessary.
Many existing image database systems support whole-image queries,
which require users to specify the contents of the whole images
to be retrieved.
However, users may only remember or care about some, but not all,
portions of the images (i.e., subimages) they have seen before.
Techniques for handling subimage queries of arbitrary size are therefore
needed.
Unfortunately, not many image database management systems
can handle these subimage queries.
Among the systems that can deal with subimage queries of arbitrary size,
multiscale similarity matching is rarely used.
To this end, we developed techniques based on multiscale
similarity matching to handle subimage queries of arbitrary size,
and applied the techniques in large image databases.
Affiliated Web Pages
Carson Kai-Sang Leung, Mark Anthony F. Mateo, and Dale A. Brajczuk.
A Tree-Based Approach for Frequent Pattern Mining from Uncertain Data.
In Proceedings of 12th Pacific-Asia Conference on Knowledge
Discovery and Data Mining (PAKDD-08),
Takashi Washio et al. (Editors),
pages 653-661. Osaka, Japan, May 2008.
Carson Kai-Sang Leung, Quamrul I. Khan, and Tariqul Hoque.
CanTree: A Tree Structure for Efficient Incremental Mining of Frequent
Patterns.
In Proceedings of the Fifth IEEE International Conference on
Data Mining (ICDM-05),
Jiawei Han, Benjamin W. Wah, Vijay Raghavan, Xindong Wu, and
Rajeev Rastogi (Editors),
pages 274-281. Houston, TX, USA, November 2005.
An extension appears in
Knowledge and Information Systems: An International Journal,
Volume 11, Issue 3, pages 287-312, April 2007.
Carson Kai-Sang Leung and Wookey Lee.
Efficient Update of Data Warehouse Views with Generalised
Referential Integrity Differential Files.
In Proceedings of the 23rd British National Conference on
Databases (BNCOD 23),
David Bell and Jun Hong (Editors),
pages 199-211. Belfast, Northern Ireland, UK, July 2006.
Laks V.S. Lakshmanan, Carson Kai-Sang Leung, and Raymond T. Ng.
Efficient Dynamic Mining of Constrained Frequent Sets.
ACM Transactions on Database Systems (TODS),
Volume 28, Issue 4, pages 337-389, December 2003.
Carson Kai-Sang Leung, Raymond T. Ng, and Heikki Mannila.
OSSM: A Segmentation Approach to Optimize Frequency Counting.
In Proceedings of the 18th International Conference on
Data Engineering (ICDE 2002),
Rakesh Agrawal, Klaus Dittrich, and Anne H.H. Ngu (Editors),
pages 583-592. San Jose, CA, USA, February/March 2002.
Carson Kai-Sang Leung.
Data Mining in SQL.
Research Report, IBM Centre for Advanced Studies (Toronto) &
The University of British Columbia, August 2000.
Carson Kai-Sang Leung.
Evaluation of Data Mining Opportunities at Workers' Compensation
Research Report, Workers' Compensation Board of British Columbia &
The University of British Columbia, November 1998.
Kai-Sang Leung and Raymond Ng.
Multiscale Similarity Matching for Subimage Queries of Arbitrary
Size.
In Visual Database Systems 4,
Yannis Ioannidis and Wolfgang Klas (Editors),
London, UK: Chapman & Hall, 1998.
Faculty of Science:
Faculty of Graduate Studies:
Database and Data Mining Laboratory:
Institute of Industrial Mathematical Sciences (IIMS):
TRTech / TRLabs:
The University of British Columbia:
UBC Department of Computer Science:
UBC Database Systems Laboratory:
Personal Web Page at UBC: