|
Tutorial Sessions
All
tutorials are free to registered conference attendees of
all conferences held at WORLDCOMP'09. Those who are
interested in attending one or more of the tutorials
should sign up on site at the conference registration desk
in Las Vegas.
A
complete & current list of WORLDCOMP Tutorials
can be found
here.
In addition to
tutorials at other conferences, DMIN'09
provides a set of tutorials dedicated to Data Mining
topics. The 2007 key tutorial was given
by Prof. Eamonn
Keogh on Time Series Clustering. The 2008 key tutorial
was presented by Mikhail Golovnya (Senior Scientist,
Salford Systems, USA) on Advanced Data Mining
Methodologies.
This year DMIN will
provide the following tutorials:
Tutorial A |
Organizer: |
Nitesh V. Chawla,
University of Notre Dame, USA |
|
Topic: |
Data Mining with
Sensitivity to Rare Events and Class Imbalance |
Webpage |
http://www.cse.nd.edu/~nchawla/ |
Date & Time |
July 13, 2009
(6:00pm - 9:00pm) |
Location |
Gold Room |
Description |
Recent years have brought increased interest in
applying data mining techniques to difficult 'real-world'
problems, many of which are characterized by
imbalanced learning data, where at least one
class is much rarer relative to others. Examples
include (but are not limited to): fraud/intrusion
detection, risk management, medical diagnosis/monitoring,
bioinformatics, text categorization and
personalization of information. The problem of
imbalanced data is also often associated with
asymmetric costs of misclassifying elements of
different classes. Additionally, the distribution
of the test data may differ from that of the
learning sample and the true misclassification
costs may be unknown at learning time.
Predictive accuracy, a popular choice for
evaluating performance of a classifier, will not
be appropriate when the data is imbalanced and/or
the costs of different errors vary markedly.
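The point about predictive accuracy can be illustrated with a small sketch. The data set below is hypothetical (990 negatives and 10 positives), and the function names and the 100:1 cost ratio are assumptions made for illustration; none of them come from the tutorial itself.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of all examples that are classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def recall(y_true, y_pred):
    """Fraction of the rare (positive) class that is detected."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def total_cost(y_true, y_pred, fn_cost=100, fp_cost=1):
    """Total misclassification cost under assumed asymmetric costs."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return fn * fn_cost + fp * fp_cost


# Hypothetical imbalanced data: 1% of the examples are rare events.
y_true = [0] * 990 + [1] * 10

# Classifier 1: always predicts the majority class.
always_negative = [0] * 1000

# Classifier 2: raises 50 false alarms but catches 8 of the 10 rare events.
detector = [1] * 50 + [0] * 940 + [1] * 8 + [0] * 2

for name, y_pred in [("always_negative", always_negative), ("detector", detector)]:
    print(name, accuracy(y_true, y_pred), recall(y_true, y_pred),
          total_cost(y_true, y_pred))
```

Under these assumptions the always-negative classifier reaches 99% accuracy while detecting none of the rare events, whereas the detector has lower accuracy (94.8%) but a recall of 0.8 and a quarter of the total cost (250 versus 1000).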
This tutorial will introduce the problem of
class imbalance, address the scope of solutions
available, present and contrast the appropriate
metrics for evaluating performance, and discuss
the applications with case studies. |
Short Bio |
Nitesh Chawla is
an Assistant Professor in the Department of
Computer Science and Engineering at the
University of Notre Dame. He directs the Data
Inference Analysis and Learning Lab (DIAL) and
co-directs the Interdisciplinary Center for
Network Science and Applications (iCeNSA) at
Notre Dame. His research is primarily focused on
machine learning, data mining, and social and
dynamic networks. His work has led to
applications in various domains including
biology, medicine, finance, security, social
science, fraud detection, intrusion detection,
and text categorization. He is on the editorial
board of IEEE Transactions on Systems, Man and
Cybernetics Part B. He has received various
awards and acknowledgements. He received the NAE
FIE New Faculty Fellowship in 2005. His current
research is supported by NSF, DOD, NWICG, NIJ,
and industry sponsors. |
Tutorial B |
Organizer: |
Peter Geczy,
National Institute of Advanced Industrial
Science and Technology (AIST), Japan |
|
Topic: |
Emerging Human-Web
Interaction Research |
Date & Time |
July 14, 2009
(6:00pm - 8:00pm) |
Location |
Ballroom 1 |
Description |
Abstract:
The World Wide Web has evolved from its earlier
static form to an interactive multimedia
environment. The richness of interactions is rapidly
approaching that of the conventional stand-alone
applications. Human interactivity with web-based
environments has been gaining increasing
importance in both web research and e-commerce.
Mining and exploring human-web interactions
bring numerous challenges as well as
opportunities. We will probe into the processes
and methods of human-web interaction research
ranging from data acquisition techniques,
through analytics, to applications.
Accounting for the latest advances in the field,
we will project the prospective future trends.
Objective:
The primary objective of the tutorial is to
provide a clear, yet reasonably comprehensive,
overview of the underlying principles, current
approaches, and potential future trends.
Knowledge of the state-of-the-art in human-web
interaction research should be beneficial to a
wide spectrum of individuals studying, utilizing,
designing, and/or managing web-based information
systems.
Audience:
The tutorial aims to approach a broad audience
including, but not limited to:
- Students and Educators
- Academics and Researchers
- Practitioners and Managers
The topic shall
be presented in an accessible and intuitive
manner without extensive technical details.
The material/slides will be provided after the
conference as a PDF file on this website.
|
Short Bio |
Dr. Peter Geczy is
a senior scientist at The National Institute of
Advanced Industrial Science and Technology (AIST).
He also held positions at The Institute of
Physical and Chemical Research (RIKEN) and The
Research Center for Future Technology. His
interdisciplinary scientific interests encompass
domains of human interactions and behavior in
digital environments, information systems,
knowledge management and engineering, data and
web mining, artificial intelligence, and machine
learning. His recent research focus also extends
to the spheres of service science, engineering,
management, and computing. He received several
awards in recognition of his accomplishments.
Dr. Geczy has served on various
professional committees and editorial boards, and
has been a distinguished speaker in academia and
industry. |
Tutorial C |
Organizer: |
Asim Roy, Arizona
State University |
|
Topic: |
Autonomous Machine
Learning |
Date & Time |
July 15, 2009
(6:00pm - 8:00pm) |
Location |
Ballroom 1 |
Description |
Autonomous machine learning has
become a top priority in the science and engineering
of learning. In July 2007, the NSF held a workshop on
the “Future Challenges for the Science and
Engineering of Learning.” Here is the
summary of the “Open
Questions in Both Biological and Machine
Learning”
from the workshop (http://www.cnl.salk.edu/Media/NSFWorkshopReport.v4.pdf).
“Biological learners have the ability to learn
autonomously, in an ever changing and uncertain
world. This property includes the ability to
generate their own supervision, select the most
informative training samples, produce their own
loss function, and evaluate their own
performance. More importantly, it appears that
biological learners can effectively produce
appropriate internal representations for
composable percepts - a kind of organizational scaffold - as part of
the learning process. By contrast, virtually all
current approaches to machine learning typically
require a human supervisor to design the
learning architecture, select the training
examples, design the form of the representation
of the training examples, choose the learning
algorithm, set the learning parameters, decide
when to stop learning, and choose the way in
which the performance of the learning algorithm
is evaluated. This strong dependence on human
supervision is greatly retarding the development
and ubiquitous deployment of autonomous
artificial learning systems. Although we are
beginning to understand some of the learning
systems used by brains, many aspects of
autonomous learning have not yet been
identified.”
This
dismal NSF characterization of the state of our
learning systems opens the door to creating
a new generation of learning algorithms.
And conferences such as DMIN could become the
focal point for research collaboration on this
new breed of learning algorithms.
The objective of this
tutorial is to present some new ideas regarding
brain-like learning, ideas that can lead to the
development of autonomous learning methods.
Autonomous learning is extremely important for
robotics. For autonomous robots that can learn
on their own like humans, we need
tweak-free learning algorithms that can
design and train computational structures
(e.g. neural networks) on their own without any
kind of human intervention.
Structure of the tutorial:
- Provide an overview of a broad set of principles
for designing and constructing autonomous learning algorithms.
Present some new ideas about brain-like
learning that differ from current
connectionist approaches.
- Discuss one particular autonomous learning algorithm for
pattern classification problems. Give a
demonstration of this autonomous learning
algorithm. Summarize its basic features and
design principles.
- As noted in the NSF report,
autonomous learning is the technology we need
and it is important that we get organized
and focus on this new breed of learning
algorithms. So there will be some open
discussion on this issue.
We could take this opportunity
to form a research group within DMIN for
collaboration on autonomous learning systems.
|
Short Bio |
Asim Roy
is a Professor of Information Systems at Arizona
State University. He received his M.S. in
Operations Research from Case Western Reserve
University, Cleveland, Ohio, and his Ph.D. in
Operations Research from the University of Texas at
Austin. He was a Visiting Scholar at
Stanford University, visiting the PDP group of
David Rumelhart in the Psychology department in
the early 1990s.
He was the Letters Editor of IEEE
Transactions on Neural Networks and has
served on the organizing committees of many
scientific conferences.
Asim’s research interests are in
neural networks, automated machine learning and
data mining, pattern recognition, prediction and
forecasting, intelligent systems, information
retrieval (search) and nonlinear
multiple objective optimization.
His research has been published in Management
Science, Decision Analysis, The ORSA Journal on
Computing, Naval Research Logistics, IEEE
Transactions on Neural Networks, IEEE
Transactions on Fuzzy Systems, Neural Networks,
Neural Computation and other journals.
Asim has
recently published a new theory for brain-like
learning and computing. This new theory
challenges the classical ideas that have
dominated the field of brain-like computing for
the last 50 years. PhysOrg.com recently wrote a
story on this new brain theory (http://www.physorg.com/news146319784.html).
He has been invited for plenary talks and for
tutorials, workshops and short courses on his
new learning theory and methods at many national
and international conferences. |
Tutorial D |
Organizer: |
Dan Steinberg, CEO
of Salford Systems |
|
Topic: |
A Tour of Advanced
Data Mining Methodologies |
Date & Time |
July 15, 2009
(6:00pm - 9:00/9:30pm) |
Location |
Ballroom 4 |
Description |
Abstract:
Dr. Dan
Steinberg, President and CEO of Salford Systems,
will discuss the classic CART (classification
and regression trees) technique, as well as
advanced data mining techniques recently
developed by Stanford University Professor
Jerome Friedman and University of California
Professor Emeritus Leo Breiman. Methodologies
and real-world applications will be presented
for the following:
- CART, the
classic decision tree
- MARS (multivariate
adaptive regression splines), a flexible,
highly automated regression technique
- TreeNet and
RandomForests, which leverage the predictive
power of CART models by combining a large
number of trees together using either boosting
or bootstrap aggregation approaches.
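The bootstrap aggregation idea mentioned in the last item can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the TreeNet or RandomForests implementation: it uses one-split trees ("stumps") on a single numeric feature instead of full CART trees, and the function names and toy data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a one-split tree on (x, label) pairs by minimizing training errors."""
    best = None
    for threshold in sorted({x for x, _ in points}):
        left = [y for x, y in points if x <= threshold]
        right = [y for x, y in points if x > threshold]
        # Each side predicts its most frequent label.
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label


def bagged_stumps(points, n_trees=25, seed=0):
    """Train n_trees stumps on bootstrap samples and combine them by majority vote."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(points) items with replacement.
        sample = [rng.choice(points) for _ in points]
        stumps.append(fit_stump(sample))

    def predict(x):
        votes = Counter(stump(x) for stump in stumps)
        return votes.most_common(1)[0][0]

    return predict


# Toy data: label 0 for x in 0..9, label 1 for x in 10..19.
data = [(x, 0) for x in range(10)] + [(x, 1) for x in range(10, 20)]
model = bagged_stumps(data)
print(model(5), model(15))
```

Each stump sees a slightly different resample of the data, so individual split points vary; the majority vote averages out that variation. Boosting, the other approach named above, instead fits trees sequentially so that each one concentrates on the errors of the ones before it.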
Objective:
To provide an
introduction to and overview of Data Mining
Analysis and to provide practical examples to
assist attendees in conducting their own
analyses.
Intended audience:
- Instructors
wishing to learn more about data mining so
they can include some coverage in their
classes;
- Applied
Statisticians wanting to learn new tools for
exploratory and non-parametric data analysis;
and,
- Researchers
who have previously worked with data mining
and have been mystified by earlier versions of
the documentation and output.
|
Short Bio |
Dan Steinberg, the
President and CEO of Salford Systems, founded
the company in 1983 just after receiving his
Ph.D. in Economics at Harvard.
He has also served as Assistant Professor of
Economics at the University of California, San
Diego, and participated in dozens of consulting
projects for Fortune 100 clients. Dr. Steinberg
has published articles in statistics,
econometrics, computer science, and marketing
journals, and has been a featured speaker on data
mining issues for the American Marketing
Association, the American Statistical
Association, the Direct Marketing Association
and the Casualty Actuarial Society.
|
|
|