Xiao Li

PhD
Department of Computer Science
Western University
London, Ontario N6A 5B7, Canada

Email: Please open Python in your terminal and type "x@y".replace("x", "xli485").replace("y", "csd.uwo.ca")

Contact: see my schedule

Research Interests

My research focuses on machine learning and data mining, with applications to web information mining and information retrieval.

Current Research

Hierarchical Text Classification, Active Learning, Information Retrieval, Topic Modeling, and text mining applications

Hierarchical Text Classification

Active Learning for Hierarchical Text Classification

Effective Top-Down Active Learning for Hierarchical Text Classification

Mining Academic Topics: Supervised or Unsupervised Way? (working paper)

Active Inference in Partially Observed Topic Hierarchies (working paper)

Collective Hierarchical Text Classification (working paper)

Information Retrieval

Publications

See Papers

Professional Activities

External reviewer for KDD 2013, KDD 2012, ECML 2012, KDD 2011, ICDM 2011, IEEE TKDE, ACM TKDD, and ACM TIST.

Certificates

  • Statement of Accomplishment for the online Machine Learning course from Stanford University, taught by Prof. Andrew Ng, December 2011;
  • Statement of Accomplishment for the online Artificial Intelligence course from Stanford University, taught by Prof. Sebastian Thrun and Prof. Peter Norvig, December 2011;
  • Teaching Assistant Training Program certificate, Teaching Support Center, University of Western Ontario, London, ON, Canada, August 2011.

Awards

  • First Prize in the Data Mining Session at the UWO Research in Computer Science Conference (UWORCS), 2013
  • First Prize in the Data Mining Session at the UWO Research in Computer Science Conference (UWORCS), 2012
  • First Prize in the Software Demo Session at the UWO Research in Computer Science Conference (UWORCS), 2012

Research Assistant

  • Worked at the Data Mining & Business Intelligence Lab from October 2009 to November 2013.

Teaching Assistant

  • CS1032b: Information Systems and Design, 2013 Winter
  • CS1032b: Information Systems and Design, 2012 Winter
  • CS1032b: Information Systems and Design, 2011 Winter
  • CS1026b: Computer Science Fundamentals I, 2010 Winter

Technical Skills

  • Familiar with the Apache Nutch and Apache Solr web search platforms.
    1. I customize Nutch to meet the specific crawling and text extraction requirements of SEEU.
    2. I modify the Solr source code to support advanced search functionality, such as complex query elevation in a hierarchical search engine, fast query suggestion, and spell checking.
  • Competent in parallel machine learning system development on Hadoop MapReduce (Java), SHARCNET OpenMPI (C++), and Parallel IPython clusters (Python). The code I have written includes, but is not limited to:
    1. An HTML text feature and anchor feature extraction tool based on Hadoop MapReduce (less than 30 minutes to extract features from 600,000 HTML webpages).
    2. A large-scale parallel hierarchical text classification system based on C++ OpenMPI (about 1,200 seconds to train 662 SVM (Support Vector Machine) classifiers on 1.2 million webpages).
    3. Several packages for large-scale data mining experiments and system development based on the IPython parallel computing module (my favourite tool for scientific experiments).
  • Competent in Ubuntu cluster system management.
    1. I have three years of experience managing the cluster system in the Data Mining & Business Intelligence Lab.
    2. I set up the Hadoop and MPI cluster on five standard PCs (each with four 2.4 GHz cores and 4 GB of memory). This hardware and software infrastructure is the foundation for both the SEE and SEEU search engines.
  • Very strong programming skills in web development, including HTML, JavaScript, CSS, jQuery, JSP, Java Servlets, and MySQL.
    1. I am the main programmer of SEEU (both PC and mobile versions).
    2. I created and manage the web portal of the Data Mining & Business Intelligence Lab.
  • My favourite programming languages (in ranked order): Python, Java, C++, C, JavaScript.
  • Other programming skills: I occasionally read and write code in Matlab, Perl, C#, and VB.

I will be on the job market soon. You can find more about me on LinkedIn.

Other Fun

