The Ninth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining
Washington, DC, USA August 24 - 27, 2003


KDD 2003 - Invited Speakers

Jim Gray (Microsoft Research)

On-Line Science: The World-Wide Telescope as a Prototype for the New Computational Science

Daphne Koller (Stanford University)

Statistical Learning from Relational Data

Andreas S. Weigend (Chief Scientist, Amazon.com)

Analyzing Customer Behavior at Amazon.com

Jim Gray (Microsoft Research)
Title: On-Line Science: The World-Wide Telescope as a Prototype for the New Computational Science


Computational science has historically meant simulation, but there is an increasing role for analysis and mining of online scientific data. As a case in point, half of the world's astronomy data is public. The astronomy community is putting all that data on the Internet so that the Internet becomes the world's best telescope: it has the whole sky, in many bands, and in detail as good as the best 2-year-old telescopes. It is usable by all astronomers everywhere. This is the vision of the virtual observatory -- also called the World Wide Telescope (WWT). As one step along that path I have been working with the Sloan Digital Sky Survey (especially Alex Szalay of Johns Hopkins) and Caltech to federate their data in web services on the Internet, and to make it easy to ask questions of the database (see http://skyserver.sdss.org). This talk explains the rationale for the WWT, discusses how we designed the database, and talks about some data mining tasks. It also describes computer science challenges of publishing, federating, and mining scientific data, and argues that XML web services are key to federating diverse data sources.


Jim Gray is part of Microsoft's research group. His work focuses on databases and transaction processing. Jim is active in the research community, is an ACM, NAE, NAS, and AAAS Fellow, and received the ACM Turing Award for his work on transaction processing. He edits a book series on data management, and is active in building online databases like http://terraService.Net/ and http://skyserver.sdss.org. Jim's home page is http://research.microsoft.com/~Gray/

Daphne Koller (Stanford University)
Title: Statistical Learning from Relational Data


Much of the data in the world is relational in nature, involving multiple objects, related to each other in a variety of ways. Examples include structured databases such as customer transaction data, semi-structured data such as hyperlinked pages on the world-wide web or networks of interacting genes, and unstructured data such as text. In this talk, I will describe a statistical framework for learning from relational data. The approach is based on probabilistic models, which have been applied with great success to a variety of machine learning tasks. Generally, this framework has been applied to data represented as fixed-length attribute-value vectors, or to sequence data. I will describe the language of probabilistic relational models (PRMs), which extend probabilistic graphical models with the expressive power of object-relational languages. PRMs model the uncertainty over the attributes of objects in the domain as well as uncertainty over the existence of relations between objects. I will present techniques for automatically learning PRMs directly from a relational data set, and applications of these techniques to various tasks, such as: collective classification of an entire set of related entities; clustering a set of linked entities into coherent groups; and even predicting the existence of links between entities. The talk will demonstrate the applicability of the techniques on several domains, such as web data and biological data.


Daphne Koller received her PhD from Stanford University in 1994. After a two-year postdoc at Berkeley, she returned to Stanford, where she is now an Associate Professor in the Computer Science Department. Her main research interest is in creating large-scale systems that reason and act under uncertainty, using techniques from decision theory and economics. Daphne Koller is the author of over 70 refereed publications, which have appeared in AI, theoretical computer science, and economics venues. She was the co-chair of the recent UAI 2001 conference, has served on numerous program committees, and has been an associate editor of the Journal of Artificial Intelligence Research and of the Machine Learning Journal. She was awarded the Arthur Samuel Thesis Award in 1994, the Sloan Foundation Faculty Fellowship in 1996, the ONR Young Investigator Award in 1998, the Presidential Early Career Award for Scientists and Engineers (PECASE) in 1999, and the IJCAI Computers and Thought Award at the IJCAI 2001 conference.

Andreas S. Weigend (Chief Scientist, Amazon.com)
Title: Analyzing Customer Behavior at Amazon.com


The first part of the talk gives an overview of the different kinds of data available at Amazon.com, emphasizing that data mining needs to drive actions such as emails, coupons, and recommendations of products, product groups, or site features. The scope of the actions ranges from the individual customer, through pre-computed customer segments, to the entire customer base.

The second part presents joint work with Bruce D'Ambrosio (Cleverset, Inc.) on relational probabilistic models for customer behavior, both for discovering static customer attributes, and for dynamically predicting the intention of the customer and the outcome of a session.

The third part outlines current research problems, such as modeling and eventually influencing the long-term behavior of customers. In addition to the importance of machine learning, it shows the central role that principles of behavioral economics, judgment, and decision making play in computational marketing.


Andreas S. Weigend is the Chief Scientist at Amazon.com where he is responsible for research in machine learning and computational marketing. Applications include predicting user intentions and modalities in real time, and measuring and optimizing the long-term effects of promotions.

He also teaches at Stanford and in the executive program at CEIBS (China Europe International Business School, Shanghai). He previously held full-time faculty positions at New York University's Stern School of Business and at the University of Colorado at Boulder. He has published more than one hundred scientific papers and co-authored six books.

His entrepreneurial career includes Weigend Associates LLC (www.weigend.com), offering consulting services in the areas of data mining, behavioral analytics and financial trading models, as well as two startups, MoodLogic and ShockMarket.

He received his Ph.D. from Stanford University in physics and was a researcher at Xerox PARC (Palo Alto Research Center) and the Santa Fe Institute. He majored in electrical engineering, physics, and philosophy at Karlsruhe University, Bonn University (Germany), and at Trinity College, Cambridge (U.K.).

Webmaster: Osmar R. Zaïane
Last updated: June 16, 2003