2009 UC San Diego Data Mining Contest

The winners have been identified; check the results tab for information. Also, we have reopened the site for continued participation.

The History of the UCSD Student Data Mining Contest

Since 2004, UCSD and FICO have hosted this data mining contest with the goal of providing students an opportunity to test out their data mining skills.

2009 Contest

In the sixth UCSD/FICO data mining contest, international teams made their strongest showing yet, taking all but one of the twelve prizes. We had competitors from all inhabited continents except Africa: 301 teams in total, 151 of them international. In both tasks, contestants were asked to predict the presence of an anomaly in e-commerce transaction data. The data in the "hard" task lacked some of the structure present in the "easy" task's data.

The winners were:

Campus Prizes
Division Contest Winner Campus Prize
Undergrads Classification (Easy) RocketScience - Jan Hendrik Hosang RWTH Aachen University 1st Place
Undergrads Classification (Easy) Weka23 - Johannes Laudenberg RWTH Aachen University Joint 2nd Place
Undergrads Classification (Easy) Overfittrs - Pavlo Golik RWTH Aachen University Joint 2nd Place
Grads/Postdocs Classification (Easy) west - Kohei Hayashi, Masayoshi Nakamura Nara Institute of Science and Technology 1st Place
Grads/Postdocs Classification (Easy) Bayesline - Christian Buck, Tobias Weyand RWTH Aachen University 2nd Place
Grads/Postdocs Classification (Easy) Charong - Xiao Guo Beijing University of Technology 3rd Place
Undergrads Classification (Hard) RocketScience - Jan Hendrik Hosang RWTH Aachen University 1st Place
Undergrads Classification (Hard) lhl - Honglei Liu Nanjing University of Science and Technology 2nd Place
Undergrads Classification (Hard) nj88 - Qinghua Xu Nanjing University of Science and Technology 3rd Place
Grads/Postdocs Classification (Hard) weka5 - Quan Sun The University of Waikato 1st Place
Grads/Postdocs Classification (Hard) haveaniceday - Mark Nelson Georgia Institute of Technology 2nd Place
Grads/Postdocs Classification (Hard) azida - Jeong-Min Yun Pohang University of Science and Technology 3rd Place

2008 Contest

In its fifth year, the contest attracted international attention, with competition from Europe, Asia, and North America. As in the past, the UC schools made a strong showing. Contestants were asked to classify data from a scientific experiment. The first task was to maximize accuracy given a fully labeled training data set. The second task was to maximize the F1-score given a partially labeled training set.

The winners were:

Campus Prizes
Division Contest Winner Campus Prize
Undergrads Standard Classification Pleske - Lovro Subelj U of Ljubljana 1st Place
Undergrads Standard Classification decaff - Chuan Sheng Foo Stanford 2nd Place
Undergrads Standard Classification Bureki - Ruben Sipos U of Ljubljana 3rd Place
Grads/Postdocs Standard Classification Tol - Christopher D. Wassman UC Irvine 1st Place
Grads/Postdocs Standard Classification Seafarer - Nikolaos Trogkanis UC Berkeley 2nd Place
Grads/Postdocs Standard Classification AllYourBayes - Todd Johnson, Drew Frank, David Orendorff, Julien Neel UC Irvine 3rd Place
Undergrads Positive-only Classification Pleske - Lovro Subelj U of Ljubljana 1st Place
Undergrads Positive-only Classification WarsawTeam - Marcin Nowak-Przygodzki Warsaw U 2nd Place
Undergrads Positive-only Classification Pikapolonice - Lan Zagar, Mitja Trampus, Aljaz Kosmerlj, Tadej Janez U of Ljubljana 3rd Place
Grads/Postdocs Positive-only Classification Kdd - Ka Cheung Sia UC Los Angeles 1st Place
Grads/Postdocs Positive-only Classification Miner - Chen Tierui UC Santa Barbara 2nd Place
Grads/Postdocs Positive-only Classification GoldFinger - Hetal Thakkar, Barzan Mozafari UC Los Angeles 3rd Place

2007 Contest

In its fourth year, the contest continued to attract competition from across the US, with schools ranging from Oregon State University, Stanford, and Harvey Mudd College to the University of Cincinnati, the University of Central Florida, SUNY - Univ. at Buffalo, and the Medical University of South Carolina. As always, the UC campuses had a strong presence. The contest was based on San Diego housing data, provided by DataQuick. Contestants were asked to predict whether or not a particular property would refinance, and how much it would sell for.

The winners were:

Winners of the Refinance Prediction Contest
Division Winner Campus Prize
Undergrads ImaginaryReality - Christopher Berger UC Santa Cruz 1st Place
Undergrads Franleo - Kei Shun Ma UC San Diego 2nd Place
Undergrads Mebapa - Sean (Hyun) Park UC San Diego 3rd Place
Undergrads Teamwang - Fangwen G. Wang UC Los Angeles 4th Place
Undergrads RamRod - Robin Fago UC San Diego 4th Place
Grads/Postdocs ISLab - Rongsheng Gong, Hua Li University of Cincinnati 1st Place
Grads/Postdocs Yuning - Ka Cheung Sia UC Los Angeles 2nd Place
Grads/Postdocs Tol - Christopher D. Wassman UC Irvine 3rd Place
Grads/Postdocs Century21 - Kung-Hua Chang UC Los Angeles 4th Place
Grads/Postdocs Ismene - Nikolaos Trogkanis UC San Diego 5th Place
Grads/Postdocs Hello2007 - Bo Jin Medical University of South Carolina 6th Place
Winners of the Sale Price Prediction Contest
Division Winner Campus Prize
Undergrads Mebapa - Sean (Hyun) Park UC San Diego 1st Place
Undergrads Teamwang - Fangwen G. Wang UC Los Angeles 2nd Place
Undergrads decaff - Chuan Sheng Foo Stanford 3rd Place
Undergrads ImaginaryReality - Christopher Berger UC Santa Cruz 4th Place
Grads/Postdocs Century21 - Kung-Hua Chang UC Los Angeles 1st Place
Grads/Postdocs ISLab - Rongsheng Gong, Hua Li University of Cincinnati 2nd Place
Grads/Postdocs Ismene - Nikolaos Trogkanis UC San Diego 3rd Place
Grads/Postdocs Scratch - Rajat Raina Stanford 4th Place
Grads/Postdocs Yuning - Ka Cheung Sia UC Los Angeles 5th Place
Grads/Postdocs Hello2007 - Bo Jin Medical University of South Carolina 6th Place
Grads/Postdocs PinPals - Paul Ruvolo, Ravi Mody, Adam Bickett UC San Diego 7th Place

2006 Contest

In the third year, we went national with the competition. We had 58 teams compete from a number of American universities: all of the UC campuses, U Penn, Oregon State, U Texas, U Michigan, Rutgers, SUNY Binghamton, Central Florida, OSU (Oklahoma and Oregon), Stanford, MIT, and Georgia State.

The contest focused on text processing: document classification and word prediction. The classification task was hotly contested, with a number of teams finishing very close to the top score. Two undergraduate teams (Bussewaffenlos and decaff) finished second and third overall, beating out all but one graduate team (Classified). UC Los Angeles dominated the Word Prediction task, finishing with three of the top four teams overall (ShowMeThePrizes, GoldSeekers, and MoneyMoneyMoney). Team decaff (Chuan Sheng Foo) finished an impressive third to take the top undergraduate award.

The winners were:
Campus Prizes
Division Contest Winner Campus Prize
Undergrads Document Classification Bussewaffenlos - Jenan Wise, Thomas Finsterbusch UC San Diego 1st Place
Undergrads Document Classification decaff - Chuan Sheng Foo Stanford 2nd Place
Undergrads Document Classification TodayToMine - Tom Maddock, Grace Lin UC San Diego 3rd Place
Grads/Postdocs Document Classification Classified - Kilian Weinberger, John Blitzer U Penn 1st Place
Grads/Postdocs Document Classification Jellorock - Jerry Ye UC Berkeley 2nd Place
Grads/Postdocs Document Classification SuperMiner - Zhiqiang Bi UC Santa Barbara 3rd Place
Undergrads Word Prediction decaff - Chuan Sheng Foo Stanford 1st Place
Undergrads Word Prediction ShodaMinedUrBiz - Shashwati Kasetty, Michael Rivera, Carlos Rendon, Tim Garcia UC Riverside 2nd Place
Undergrads Word Prediction SSSGuard - Thomas Allen, Mark Pirkle, Lucas (Jai) Clary UC San Diego 3rd Place
Grads/Postdocs Word Prediction ShowMeThePrizes - Kung-Hua Chang, Yong Kyun Kwon UC Los Angeles 1st Place
Grads/Postdocs Word Prediction Gold Seekers - Yijian Bai, Hetal Thakkar, Ka Cheung Sia, Alexandros Noulas UC Los Angeles 2nd Place
Grads/Postdocs Word Prediction MoneyMoneyMoney - Justin Liu, Ruey-Lung Hsiao UC Los Angeles 3rd Place

2005 Contest

In the second year, we opened the contest up to all of the University of California campuses (Berkeley, Davis, Irvine, Los Angeles, Merced, Riverside, San Francisco, San Diego, Santa Barbara, and Santa Cruz). The data set was a set of time series, where each time series could be thought of as an account profile. The contestants had to predict if and when the account changed from a 'good' state to a 'bad' state. We had over 90 teams representing 8 UC campuses participate!

The winners were:
Campus Prizes
Division Contest Winner Campus Prize
Undergrads Classification Viking - Gustav Lindstram Santa Barbara Grand Prize
Undergrads Classification Sufian - Sufian Rhazi Irvine Campus Prize
Undergrads Classification pwncs - Sherman Lee Berkeley Campus Prize
Undergrads TimeSeries MineYourOwnBiz - Shashwati Kasetty, Brandon Sterne, Michael Rivera, Adam Fomotor Riverside Grand Prize
Undergrads TimeSeries Viking - Gustav Lindstram Santa Barbara Campus Prize
Undergrads TimeSeries pwncs - Sherman Lee Berkeley Campus Prize
Undergrads Honorable Mention bam4k - Alexander Bellanca Santa Barbara  
Grad/Postdoc Classification ShowMeTheMoney - Yong Kyun Kwon, Kung-Hua Chang, Laurie O Connor, Teresa Breyer Los Angeles Grand Prize
Grad/Postdoc Classification Closer - Manos Pontikakis, Vesselin Diev, Vadim von Brzeski, Dave Helmbold Santa Cruz Campus Prize
Grad/Postdoc Classification Trex - Jianlin Cheng Irvine Campus Prize
Grad/Postdoc Classification TOTEM - Haifeng Li Riverside Campus Prize
Grad/Postdoc Classification TeamVenture - Ben Lee, Sean Wu Santa Barbara Campus Prize
Grad/Postdoc TimeSeries Closer - Manos Pontikakis, Vesselin Diev, Vadim von Brzeski, Dave Helmbold Santa Cruz Grand Prize
Grad/Postdoc TimeSeries Trex - Jianlin Cheng Irvine Campus Prize
Grad/Postdoc TimeSeries ShowMeTheMoney - Yong Kyun Kwon, Kung-Hua Chang, Laurie O Connor, Teresa Breyer Los Angeles Campus Prize
Grad/Postdoc TimeSeries KazmaKurek - Orhan Camoglu, Tolga Can, Ahmet Bulut Santa Barbara Campus Prize
Grad/Postdoc TimeSeries Digger - WeeSan Lee Riverside Campus Prize

2004 Contest

In our first year, a pilot contest was created for the UC San Diego community. The contest data set was mass spectrometry measurements from the emissions of cars and trucks. The first task was to predict if the emissions came from a car or a truck. The second task was to predict the negatively-charged mass spectrometry measurements given the positively-charged measurements. We had 10 undergraduate and 20 graduate teams compete.

The winners of the pilot competition were:

Division Contest Winner Campus
Undergrads Classification Aleksandr Simma - "bsdfish" UC San Diego
Undergrads Feature Prediction Aleksandr Simma - "bsdfish" UC San Diego
Grad/Postdoc Classification Thomas Rebotier - "TEASERrebotier" UC San Diego
Grad/Postdoc Feature Prediction Thomas Rebotier - "TEASERrebotier" UC San Diego
Grad/Postdoc Honorable Mention Douglas Turnbull - "SmogIsBad" UC San Diego

Thanks to FICO for their continuing support!