Funny Data Mining

DBLP Mining

ICML

NIPS

Sample Code

  • mining_dblp.zip: Python code for parsing DBLP web pages. The program crawls DBLP pages and extracts conference paper titles and author names. First, call parseDBLP(startURL, conferenceName) to crawl the pages. Second, call makeTitleTable(conferenceName, yearFrom, yearTo) to build the title word-count table, then call print_keyword_count_html_table(table, vocabulary, n) to build an HTML table of the words that occur more than n times. Third, call makeAuthorTable(conferenceName, yearFrom, yearTo) to build the author-paper table, then call print_author_count_html_table(data, allauthors, n) to build an HTML table of the authors who published more than n papers.
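
The counting and table-building steps described above can be illustrated with a minimal sketch. This is not the code in mining_dblp.zip: the function names (count_title_words, count_author_papers, more_than, html_table) and the sample records are hypothetical, and the crawling step is left out so the example runs without network access.

```python
from collections import Counter
from html import escape
import re

# Hypothetical sample records (title, authors); placeholders, not real DBLP data.
papers = [
    ("Learning Kernels for Graph Data", ["A. Author", "B. Author"]),
    ("Kernels and Graph Learning", ["A. Author"]),
    ("Sparse Learning", ["C. Author", "A. Author"]),
]

def count_title_words(papers):
    """Count how often each lowercase word occurs across all titles."""
    counts = Counter()
    for title, _ in papers:
        counts.update(re.findall(r"[a-z]+", title.lower()))
    return counts

def count_author_papers(papers):
    """Count the number of papers per author (each author once per paper)."""
    counts = Counter()
    for _, authors in papers:
        counts.update(set(authors))
    return counts

def more_than(counts, n):
    """Return (key, count) pairs with count > n, highest count first."""
    rows = [(key, c) for key, c in counts.items() if c > n]
    return sorted(rows, key=lambda kc: (-kc[1], kc[0]))

def html_table(rows, header):
    """Render (key, count) rows as a simple two-column HTML table."""
    lines = ["<table>", f"<tr><th>{escape(header)}</th><th>Count</th></tr>"]
    for key, c in rows:
        lines.append(f"<tr><td>{escape(key)}</td><td>{c}</td></tr>")
    lines.append("</table>")
    return "\n".join(lines)

word_rows = more_than(count_title_words(papers), 1)
author_rows = more_than(count_author_papers(papers), 1)
word_table = html_table(word_rows, "Word")
author_table = html_table(author_rows, "Author")
print(word_table)
print(author_table)
```

A real run would usually also filter stop words such as "for" and "and" before counting; the threshold n plays the same role as in the functions described above.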

Search and Rank Wikipedia

