Cynthia Rudin, PhD 
 


Associate Research Scientist
Columbia University
Center for Computational Learning Systems
Interchurch Center, 475 Riverside Drive MC 7717
New York, NY 10115

 


Introduction

In March 2007, I joined the Center for Computational Learning Systems at Columbia as an Associate Research Scientist.

My field of interest is statistical learning. The main goal of statistical learning is to characterize points chosen from an unknown probability distribution when given a representative sample from that distribution.

My interests and recent research topics have included boosting, ranking, probabilistic generalization bounds, and using dynamical systems and other tools to analyze the convergence of algorithms in unusual ways. Currently I am working on a project involving manhole event prediction using data from ConEdison. Before my current position, I worked at New York University while on an NSF postdoctoral research fellowship. In the spring of 2004, I received a PhD in Applied and Computational Mathematics at Princeton University, under the supervision of Ingrid Daubechies and Robert Schapire.


Research and Recent Results:

P-norm Ranking

I have been using L_p norms to design ranking algorithms that concentrate on the top portion of the list, where the rest of the list must only be sufficiently well constructed. The user determines how much emphasis is placed at the top. I have also been working with Heng Ji and Ralph Grishman in the Computer Science Department at NYU who are experts in natural language processing. They are specifically interested in using re-ranking algorithms for the NLP task of name tagging in Chinese; we have found that the P-Norm Ranking algorithm performs very well for this task.

"The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List,"
Submitted, 2008.
Author: Cynthia Rudin

The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List - PDF

"Ranking with a P-Norm Push,"
Proceedings of the Nineteenth Annual Conference on Computational Learning Theory (CoLT), pages 589 - 604, 2006.
Author: Cynthia Rudin

Ranking with a P-Norm Push - PDF

"Re-Ranking Algorithms for Name Tagging,"
In Proc. Human Language Technology conference - North American chapter of the Association for Computational Linguistics annual meeting (HLT-NAACL) Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing, 2006.
Authors: Heng Ji, Cynthia Rudin, Ralph Grishman

Re-Ranking Algorithms for Name Tagging - PDF


Ranking

In fall of 2004, I worked with Corinna Cortes, Mehyrar Mohri and Robert E. Schapire on problems related to ranking and boosting. We have three main results: first, we presented a margin-based bound for ranking in a general setting, using L_infinity covering numbers. Second, we produced a Smooth Margin Ranking algorithm which converges to a maximum margin solution. Finally, we gave natural (and surprising) conditions such that AdaBoost will act as a ranking algorithm. In fact, it turns out that AdaBoost (which was not designed for ranking), is asymptotically just as good for ranking as RankBoost is.

"Margin-Based Ranking and Why AdaBoost is a Good Ranking Algorithm,"
Accepted with Minor Revision to Journal of Machine Learning Research, 2008. Will be published under a different title.
Authors: Cynthia Rudin and Robert E. Schapire

Margin-Based Ranking and Why AdaBoost is a Good Ranking Algorithm - PDF

"Margin-Based Ranking and Boosting Meet in the Middle,"
Proceedings of the Eighteenth Annual Conference on Computational Learning Theory (CoLT), pages 63 - 78, 2005.
Authors: Cynthia Rudin, Corinna Cortes, Mehryar Mohri, Robert E. Schapire

Margin-Based Ranking Meets Boosting in the Middle - PDF


Smooth Margin Boosting

In spring 2004, I worked with Robert E. Schapire and Ingrid Daubechies on the convergence of boosting algorithms (a continuation of our work on the dynamics of AdaBoost).

We developed two maximum margin boosting algorithms (Coordinate Ascent Boosting and Approximate Coordinate Ascent Boosting) which are qualitatively similar to AdaBoost (except of course, the smooth margin algorithms maximize the margin, whereas AdaBoost does not always do this). Our algorithms use the "smooth margin" as their objective, which is a differentiable approximation of the true margin. We prove asymptotic convergence and two different types of convergence rate for these algorithms. We also prove a convergence rate for Breiman's arc-gv algorithm.

Also, there is an interesting result about AdaBoost, namely, we derive a direct relationship between AdaBoost's edge values and its asymptotic margin. We call this the "case of bounded edges".

"Analysis of Boosting Algorithms using the Smooth Margin Function"
Annals of Statistics, Volume 35, Number 6, pages 2723-2768, 2007.
Authors: Cynthia Rudin, Robert E. Schapire, Ingrid Daubechies

Analysis of Boosting Algorithms using the Smooth Margin - PDF

"Precise Statements of Convergence for AdaBoost and arc-gv,"
In Proc. AMS-IMS-SIAM Joint Summer Research Conference: Machine Learning, Statistics, and Discovery 131-145, 2007.
Authors: Cynthia Rudin, Robert E. Schapire, and Ingrid Daubechies

Precise Statements of Convergence for AdaBoost and arc-gv - PDF

"Boosting Based on a Smooth Margin,"
Proceedings of the Seventeenth Annual Conference on Computational Learning Theory, (CoLT), 2004. Copyright Springer.
Authors: Cynthia Rudin, Robert E. Schapire, Ingrid Daubechies

Boosting Based on a Smooth Margin - PDF


Dynamics of AdaBoost

During the fall of 2004, I had been studying AdaBoost, specifically to understand its convergence properties. In doing so, I found that AdaBoost, when studied as a dynamical system, exhibits very rich dyamical behavior such as stable cycles. Using a small toolbox of techniques for studying dynamical systems, I was able to resolve some well-studied open questions concerning AdaBoost's convergence properties. For instance, I have proved that AdaBoost may converge to a classifier with margin significantly below the maximum value; this was contrary to the widely believed conjecture that AdaBoost always converges to a maximum margin solution. (Please note that the NIPS paper below does not have this main result, the NIPS paper came before I really solved it.)

"The Dynamics of AdaBoost: Cyclic Behavior and Convergence of Margins," Journal of Machine Learning Research, 5 (Dec): 1557 - 1595, 2004.
Authors: Cynthia Rudin, Ingrid Daubechies, Robert E. Schapire

The Dynamics of AdaBoost: Cyclic Behavior and Convergence of Margins - PDF

"On the Dynamics of Boosting," Advances in Neural Information Processing Systems 16, 2003.
Authors: Cynthia Rudin, Ingrid Daubechies, and Rob Schapire

On the Dynamics of Boosting - PDF


Other Papers

"Visualization of Manhole and Precursor-Type Events for the Manhattan Electrical Distribution System," Workshop on GeoVisualization of Dynamics, Movement and Change, 11th AGILE International Conference on Geographic Information Science, 2008.
Authors: Haimonti Dutta, Cynthia Rudin, Becky Passonneau, Fred Seibel, Nandini Bhardwaj, Axinia Radeva, Zhi An Liu, Steve Ierome, Delfina Isaac.

Visualization of Manhole and Precursor-Type Events for the Manhattan Electrical Distribution System," - PDF

"Arabic Morphological Tagging, Diacritization, and Lemmatization Using Lexeme Models and Feature Ranking," The 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL/HLT)
Authors: Ryan Roth, Owen Rambow, Nizar Habash, Mona Diab, and Cynthia Rudin

Arabic Morphological Tagging, Diacritization, and Lemmatization Using Lexeme Models and Feature Ranking - PDF

"Stability of Learning algorithms."
Computer Science ArXiV
Author: Cynthia Rudin

A Different Type of Convergence for Statistical Learning Algorithms - PDF

"Equilibrium Island Arrays in Strained Solid Films,"
Journal of Applied Physics, November 15 1999 - Volume 86, Issue 10, pp. 5530-5536.
Authors: Cynthia Rudin and Brian Spencer.

Equilibrium Island Ridge Arrays in Strained Solid Films - PDF



Links:

PACM    Program for Applied and Computational Mathematics, Princeton University

University at Buffalo    SUNY Buffalo, my undergraduate institution.