Archive-name: ai-faq/neural-nets/part4
Last-modified: 2002-08-30
URL: ftp://ftp.sas.com/pub/neural/FAQ4.html
Maintainer: saswss@unx.sas.com (Warren S. Sarle)
This is part 4 (of 7) of a monthly posting to the Usenet newsgroup comp.ai.neural-nets. See part 1 of this posting for full information about what it is all about.
------------------------------------------------------------------------
There are many on-line bookstores, such as:
If you have questions on feedforward nets that aren't answered by Bishop, try Masters (1993) or Reed and Marks (1999) for practical issues or Ripley (1996) for theoretical issues, all of which are reviewed below.
Bigus has written an excellent introduction to NNs for the SBE. Bigus says (p. xv), "For business executives, managers, or computer professionals, this book provides a thorough introduction to neural network technology and the issues related to its application without getting bogged down in complex math or needless details. The reader will be able to identify common business problems that are amenable to the neural network approach and will be sensitized to the issues that can affect successful completion of such applications." Bigus succeeds in explaining NNs at a practical, intuitive, and necessarily shallow level without formulas--just what the SBE needs. This book is far better than Caudill and Butler (1990), a popular but disastrous attempt to explain NNs without formulas.
Chapter 1 introduces data mining and data warehousing, and sketches some applications thereof. Chapter 2 is the semi-obligatory philosophico-historical discussion of AI and NNs and is well-written, although the SBE in a hurry may want to skip it. Chapter 3 is a very useful discussion of data preparation. Chapter 4 describes a variety of NNs and what they are good for. Chapter 5 goes into practical issues of training and testing NNs. Chapters 6 and 7 explain how to use the results from NNs. Chapter 8 discusses intelligent agents. Chapters 9 through 12 contain case histories of NN applications, including market segmentation, real-estate pricing, customer ranking, and sales forecasting.
Bigus provides generally sound advice. He briefly discusses overfitting and overtraining without going into much detail, although I think his advice on p. 57 to have at least two training cases for each connection is somewhat lenient, even for noise-free data. I do not understand his claim on pp. 73 and 170 that RBF networks have advantages over backprop networks for nonstationary inputs--perhaps he is using the word "nonstationary" in a sense different from the statistical meaning of the term. There are other things in the book that I would quibble with, but I did not find any of the flagrant errors that are common in other books on NN applications such as Swingler (1996).
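To make the cases-per-connection heuristic concrete, here is a minimal sketch in Python (my own illustration, not from Bigus) that counts the weights and biases of a small fully connected MLP and applies the two-cases-per-connection guideline; the layer sizes are hypothetical:

    def count_connections(layer_sizes):
        """Count weights and biases in a fully connected feedforward net."""
        weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
        biases = sum(layer_sizes[1:])
        return weights + biases

    layers = [10, 5, 1]                  # hypothetical: 10 inputs, 5 hidden units, 1 output
    n_conn = count_connections(layers)   # 10*5 + 5*1 + 5 + 1 = 61
    print("connections:", n_conn)
    print("suggested minimum training cases:", 2 * n_conn)   # Bigus's (lenient) rule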
The one serious drawback of this book is that it is more than one page long and may therefore tax the attention span of the SBE. But any SBE who succeeds in reading the entire book should learn enough to be able to hire a good NN expert to do the real work.
Review by Ian Cresswell:
What a relief! As a broad introductory text this is without any doubt the best currently available in its area. It doesn't include source code of any kind (normally this is badly written and compiler specific). The algorithms for many different kinds of simple neural nets are presented in a clear step by step manner in plain English.

Smith, M. (1996), Neural Networks for Statistical Modeling, NY: Van Nostrand Reinhold, ISBN 0-442-01310-8.

Equally, the mathematics is introduced in a relatively gentle manner. There are no unnecessary complications or diversions from the main theme.
The examples that are used to demonstrate the various algorithms are detailed but (perhaps necessarily) simple.
There are bad things that can be said about most books. There are only a small number of minor criticisms that can be made about this one. More space should have been given to backprop and its variants because of the practical importance of such methods. And while the author discusses early stopping in one paragraph, the treatment of generalization is skimpy compared to the books by Weiss and Kulikowski or Smith listed above.
If you're new to neural nets and you don't want to be swamped by bogus ideas, huge amounts of intimidating-looking mathematics, a programming language that you don't know, etc., then this is the book for you.
In summary, this is the best starting point for the outsider and/or beginner... a truly excellent text.
Weiss, S.M. and Kulikowski, C.A. (1991), Computer Systems That Learn,
Morgan Kaufmann. ISBN 1-55860-065-5.
Author's Webpage: Kulikowski: http://ruccs.rutgers.edu/faculty/kulikowski.html
Book Webpage (Publisher): http://www.mkp.com/books_catalog/1-55860-065-5.asp
Additional Information: Information on Weiss, S.M. is not available.
Briefly covers, at a very elementary level, feedforward nets, linear and
nearest-neighbor discriminant analysis, trees, and expert systems,
emphasizing practical applications. For a book at this level, it has an
unusually good chapter on estimating generalization error, including
bootstrapping.
1 Overview of Learning Systems 1.1 What is a Learning System? 1.2 Motivation for Building Learning Systems 1.3 Types of Practical Empirical Learning Systems 1.3.1 Common Theme: The Classification Model 1.3.2 Let the Data Speak 1.4 What's New in Learning Methods 1.4.1 The Impact of New Technology 1.5 Outline of the Book 1.6 Bibliographical and Historical Remarks 2 How to Estimate the True Performance of a Learning System 2.1 The Importance of Unbiased Error Rate Estimation 2.2. What is an Error? 2.2.1 Costs and Risks 2.3 Apparent Error Rate Estimates 2.4 Too Good to Be True: Overspecialization 2.5 True Error Rate Estimation 2.5.1 The Idealized Model for Unlimited Samples 2.5.2 Train-and Test Error Rate Estimation 2.5.3 Resampling Techniques 2.5.4 Finding the Right Complexity Fit 2.6 Getting the Most Out of the Data 2.7 Classifier Complexity and Feature Dimensionality 2.7.1 Expected Patterns of Classifier Behavior 2.8 What Can Go Wrong? 2.8.1 Poor Features, Data Errors, and Mislabeled Classes 2.8.2 Unrepresentative Samples 2.9 How Close to the Truth? 2.10 Common Mistakes in Performance Analysis 2.11 Bibliographical and Historical Remarks 3 Statistical Pattern Recognition 3.1 Introduction and Overview 3.2 A Few Sample Applications 3.3 Bayesian Classifiers 3.3.1 Direct Application of the Bayes Rule 3.4 Linear Discriminants 3.4.1 The Normality Assumption and Discriminant Functions 3.4.2 Logistic Regression 3.5 Nearest Neighbor Methods 3.6 Feature Selection 3.7 Error Rate Analysis 3.8 Bibliographical and Historical Remarks 4 Neural Nets 4.1 Introduction and Overview 4.2 Perceptrons 4.2.1 Least Mean Square Learning Systems 4.2.2 How Good Is a Linear Separation Network? 4.3 Multilayer Neural Networks 4.3.1 Back-Propagation 4.3.2 The Practical Application of Back-Propagation 4.4 Error Rate and Complexity Fit Estimation 4.5 Improving on Standard Back-Propagation 4.6 Bibliographical and Historical Remarks 5 Machine Learning: Easily Understood Decision Rules 5.1 Introduction and Overview 5.2 Decision Trees 5.2.1 Finding the Perfect Tree 5.2.2 The Incredible Shrinking Tree 5.2.3 Limitations of Tree Induction Methods 5.3 Rule Induction 5.3.1 Predictive Value Maximization 5.4 Bibliographical and Historical Remarks 6 Which Technique is Best? 6.1 What's Important in Choosing a Classifier? 6.1.1 Prediction Accuracy 6.1.2 Speed of Learning and Classification 6.1.3 Explanation and Insight 6.2 So, How Do I Choose a Learning System? 6.3 Variations on the Standard Problem 6.3.1 Missing Data 6.3.2 Incremental Learning 6.4 Future Prospects for Improved Learning Methods 6.5 Bibliographical and Historical Remarks 7 Expert Systems 7.1 Introduction and Overview 7.1.1 Why Build Expert Systems? New vs. Old Knowledge 7.2 Estimating Error Rates for Expert Systems 7.3 Complexity of Knowledge Bases 7.3.1 How Many Rules Are Too Many? 7.4 Knowledge Base Example 7.5 Empirical Analysis of Knowledge Bases 7.6 Future: Combined Learning and Expert Systems 7.7 Bibliographical and Historical Remarks
Reed, R.D., and Marks, R.J, II (1999),
Neural Smithing: Supervised Learning in Feedforward Artificial
Neural Networks,
Cambridge, MA: The MIT Press, ISBN 0-262-18190-8.
Author's Webpage: Marks: http://cialab.ee.washington.edu/Marks.html
Book Webpage (Publisher):
http://mitpress.mit.edu/book-home.tcl?isbn=0262181908
After you have read Smith (1996) or Weiss and Kulikowski (1991), consult
Reed and Marks for practical details on training MLPs (other types of
neural nets such as RBF networks are barely even mentioned). They
provide extensive coverage of backprop and its variants, and they also
survey conventional optimization algorithms. Their coverage of
initialization methods, constructive networks, pruning, and
regularization methods is unusually thorough. Unlike the vast majority
of books on neural nets, this one has lots of really informative graphs.
The chapter on generalization assessment is slightly weak, which is why
you should read Smith (1996) or Weiss and Kulikowski (1991) first. Also,
there is little information on data preparation, for which Smith (1996)
and Masters (1993; see below) should be consulted. There is some
elementary calculus, but not enough that it should scare off anybody.
Many second-rate books treat neural nets as mysterious black boxes, but
Reed and Marks open up the box and provide genuine insight into the way
neural nets work.
One problem with the book is that the terms "validation set" and "test set" are used inconsistently.
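To make the distinction concrete, here is a minimal sketch (mine, not Reed and Marks's) of the conventional three-way split: the training set fits the weights, the validation set is used to compare models or decide when to stop training, and the test set is touched only once for the final estimate of generalization error. The data and the 60/20/20 proportions are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))           # made-up inputs
    y = rng.normal(size=1000)                # made-up targets

    idx = rng.permutation(len(X))            # shuffle before splitting
    train, val, test = idx[:600], idx[600:800], idx[800:]   # 60/20/20 split

    X_train, y_train = X[train], y[train]    # used to fit the weights
    X_val, y_val = X[val], y[val]            # used to compare models / stop training
    X_test, y_test = X[test], y[test]        # used once, to estimate generalization error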
Chapter headings: Supervised Learning; Single-Layer Networks; MLP Representational Capabilities; Back-Propagation; Learning Rate and Momentum; Weight-Initialization Techniques; The Error Surface; Faster Variations of Back-Propagation; Classical Optimization Techniques; Genetic Algorithms and Neural Networks; Constructive Methods; Pruning Algorithms; Factors Influencing Generalization; Generalization Prediction and Assessment; Heuristics for Improving Generalization; Effects of Training with Noisy Inputs; Linear Regression; Principal Components Analysis; Jitter Calculations; Sigmoid-like Nonlinear Functions
Masters, T. (1995) Advanced Algorithms for Neural Networks:
A C++ Sourcebook,
NY: John Wiley and Sons, ISBN 0-471-10588-0
Book Webpage (Publisher): http://www.wiley.com/
Additional Information: One has to search.
Clear explanations of conjugate gradient and Levenberg-Marquardt
optimization algorithms, simulated annealing, kernel regression (GRNN)
and discriminant analysis (PNN), Gram-Charlier networks, dimensionality
reduction, cross-validation, and bootstrapping.
Masters, T. (1994), Signal and Image Processing with Neural
Networks: A C++ Sourcebook,
NY: Wiley, ISBN 0-471-04963-8.
Book Webpage (Publisher): http://www.wiley.com/
Additional Information: One has to search.
Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford: Oxford University Press.
Geoffrey Hinton writes in the foreword:
"Bishop is a leading researcher who has a deep understanding of the
material and has gone to great lengths to organize it in a sequence that
makes sense. He has wisely avoided the temptation to try to cover
everything and has therefore omitted interesting topics like
reinforcement learning, Hopfield networks, and Boltzmann machines in
order to focus on the types of neural networks that are most widely used
in practical applications. He assumes that the reader has the basic
mathematical literacy required for an undergraduate science degree, and
using these tools he explains everything from scratch. Before
introducing the multilayer perceptron, for example, he lays a solid
foundation of basic statistical concepts. So the crucial concept of
overfitting is introduced using easily visualized examples of
one-dimensional polynomials and only later applied to neural networks.
An impressive aspect of this book is that it takes the reader all the
way from the simplest linear models to the very latest Bayesian
multilayer neural networks without ever requiring any great intellectual
leaps."
Chapter headings: Statistical Pattern Recognition; Probability Density Estimation; Single-Layer Networks; The Multi-layer Perceptron; Radial Basis Functions; Error Functions; Parameter Optimization Algorithms; Pre-processing and Feature Extraction; Learning and Generalization; Bayesian Techniques; Symmetric Matrices; Gaussian Integrals; Lagrange Multipliers; Calculus of Variations; Principal Components.
Hertz, J., Krogh, A., and Palmer, R. (1991).
Introduction to the Theory of Neural Computation.
Redwood City, CA: Addison-Wesley,
ISBN 0-201-50395-6 (hardbound) and 0-201-51560-1 (paperbound)
Book Webpage (Publisher):
http://www2.awl.com/gb/abp/sfi/computer.html
This is an excellent classic work on neural nets from the perspective of
physics covering a wide variety of networks.
Comments from readers of comp.ai.neural-nets: "My first
impression is that this one is by far the best book on the topic. And
it's below $30 for the paperback."; "Well written, theoretical (but not
overwhelming)"; It provides a good balance of model development,
computational algorithms, and applications. The mathematical derivations
are especially well done"; "Nice mathematical analysis on the mechanism
of different learning algorithms"; "It is NOT for the mathematical beginner.
If you don't have a good grasp of higher level math, this book can be
really tough to get through."
Devroye, L., Györfi, L., and Lugosi, G. (1996),
A Probabilistic Theory of Pattern Recognition,
NY: Springer, ISBN 0-387-94618-7, vii+636 pages.
This book has relatively little material explicitly about neural nets,
but what it has is very interesting and much of it is not found in other
texts. The emphasis is on statistical proofs of universal consistency
for a wide variety of methods, including histograms, k-nearest neighbors, kernels
(PNN), trees, generalized linear discriminants, MLPs, and RBF networks.
There is also considerable material on validation and cross-validation.
The authors say, "We did not scar the pages with backbreaking
simulations or quick-and-dirty engineering solutions" (p. 7).
The formula-to-text ratio is high, but the writing is quite
clear, and anyone who has had a year or two of mathematical statistics
should be able to follow the exposition.
Chapter headings: The Bayes Error; Inequalities and Alternate Distance
Measures; Linear Discrimination; Nearest Neighbor Rules; Consistency;
Slow Rates of Convergence; Error Estimation; The Regular Histogram Rule;
Kernel Rules; Consistency of the k-Nearest Neighbor Rule;
Vapnik-Chervonenkis Theory; Combinatorial Aspects of Vapnik-Chervonenkis
Theory; Lower Bounds for Empirical Classifier Selection; The Maximum
Likelihood Principle; Parametric Classification; Generalized Linear
Discrimination; Complexity Regularization; Condensed and Edited Nearest
Neighbor Rules; Tree Classifiers; Data-Dependent Partitioning; Splitting
the Data; The Resubstitution Estimate; Deleted Estimates of the Error
Probability; Automatic Kernel Rules; Automatic Nearest Neighbor Rules;
Hypercubes and Discrete Spaces; Epsilon Entropy and Totally Bounded
Sets; Uniform Laws of Large Numbers; Neural Networks; Other Error
Estimates; Feature Extraction.
Hagan, M.T., Demuth, H.B., and Beale, M. (1996),
Neural Network Design,
Boston: PWS, ISBN 0-534-94332-2.
It doesn't really say much about design, but this book provides
formulas and examples in excruciating detail for a wide variety
of networks. It also includes some mathematical background material.
Chapter headings: Neuron Model and Network Architectures; An
Illustrative Example; Perceptron Learning Rule; Signal and Weight Vector
Spaces; Linear Transformations for Neural Networks; Supervised Hebbian
Learning; Performance Surfaces and Optimum Points; Performance
Optimization; Widrow-Hoff Learning; Backpropagation; Variations on
Backpropagation; Associative Learning; Competitive Networks; Grossberg
Network; Adaptive Resonance Theory; Stability; Hopfield Network.
Abdi, H., Valentin, D., and Edelman, B. (1999),
Neural Networks,
Sage University Papers Series on Quantitative Applications in the
Social Sciences, 07-124, Thousand Oaks, CA: Sage, ISBN 0-7619-1440-4.
Inexpensive, brief (89 pages) but very detailed explanations of
linear networks and the basics of backpropagation.
Chapter headings: 1. Introduction 2. The Perceptron 3. Linear
Autoassociative Memories 4. Linear Heteroassociative Memories 5. Error
Backpropagation 6. Useful References.
Rolls, E.T., and Treves, A. (1997),
Neural Networks and Brain Function,
Oxford: Oxford University Press, ISBN: 0198524323.
Chapter headings: Introduction; Pattern association memory;
Autoassociation memory; Competitive networks, including self-organizing
maps; Error-correcting networks: perceptrons, the delta rule,
backpropagation of error in multilayer networks, and reinforcement
learning algorithms; The hippocampus and memory; Pattern association in
the brain: amygdala and orbitofrontal cortex; Cortical networks for
invariant pattern recognition; Motor systems: cerebellum and basal
ganglia; Cerebral neocortex.
Schmajuk, N.A. (1996)
Animal Learning and Cognition: A Neural Network Approach,
Cambridge: Cambridge University Press, ISBN 0521456967.
Chapter headings: Neural networks and associative learning; Classical
conditioning: data and theories; Cognitive mapping; Attentional
processes; Storage and retrieval processes; Configural processes;
Timing; Operant conditioning and animal communication: data, theories,
and networks; Animal cognition: data and theories; Place learning and
spatial navigation; Maze learning and cognitive mapping; Learning,
cognition, and the hippocampus: data and theories; Hippocampal
modulation of learning and cognition; The character of the psychological
law.
Arbib, M.A., ed. (1995),
The Handbook of Brain Theory and Neural Networks,
Cambridge, MA: The MIT Press, ISBN 0-262-51102-9.
From The Publisher:
The heart of the book, part III, comprises 267 original articles by
leaders in the various fields, arranged alphabetically by title. Parts I
and II, written by the editor, are designed to help readers orient
themselves to this vast range of material. Part I, Background,
introduces several basic neural models, explains how the present study
of brain theory and neural networks integrates brain theory, artificial
intelligence, and cognitive psychology, and provides a tutorial on the
concepts essential for understanding neural networks as dynamic,
adaptive systems. Part II, Road Maps, provides entry into the many
articles of part III through an introductory "Meta-Map" and twenty-three
road maps, each of which tours all the Part III articles on the chosen
theme.
Touretzky, D., Hinton, G, and Sejnowski, T., eds., (1989) Proceedings of the 1988 Connectionist Models Summer School, San Mateo, CA: Morgan Kaufmann, ISBN: 1558600337
NIPS:
Plunkett, K., and Elman, J.L. (1997),
Exercises in Rethinking Innateness: A Handbook for Connectionist Simulations,
Cambridge, MA: The MIT Press, ISBN: 0262661055.
Chapter headings: Introduction and overview; The methodology of
simulations; Learning to use the simulator; Learning internal
representations; Autoassociation; Generalization; Translation
invariance; Simple recurrent networks; Critical points in learning;
Modeling stages in cognitive development; Learning the English past
tense; The importance of starting small.
Husmeier, D. (1999), Neural Networks for Conditional Probability Estimation: Forecasting Beyond Point Predictions, Berlin: Springer Verlag, ISBN 185233095.
Kosko, B. (1997), Fuzzy Engineering,
Upper Saddle River, NJ: Prentice Hall, ISBN 0-13-124991-6.
Kosko's new book is a big improvement over his older neurofuzzy book
and makes an excellent sequel to Brown and Harris (1994).
Nauck, D., Klawonn, F., and Kruse, R. (1997),
Foundations of Neuro-Fuzzy Systems,
Chichester: Wiley, ISBN 0-471-97151-0.
Chapter headings: Historical and Biological Aspects; Neural Networks;
Fuzzy Systems; Modelling Neuro-Fuzzy Systems; Cooperative Neuro-Fuzzy
Systems; Hybrid Neuro-Fuzzy Systems; The Generic Fuzzy Perceptron;
NEFCON - Neuro-Fuzzy Control; NEFCLASS - Neuro-Fuzzy Classification;
NEFPROX - Neuro-Fuzzy Function Approximation; Neural Networks and Fuzzy
Prolog; Using Neuro-Fuzzy Systems.
Among conceptually integrated books, there are two excellent books that use the Vapnik-Chervonenkis (VC) theory as a unifying theme and provide strong coverage of support vector machines and fuzzy logic, as well as neural nets. Of these two, Kecman (2001) provides clearer explanations and better diagrams, but Cherkassky and Mulier (1998) are better organized and have an excellent section on unsupervised learning, especially self-organizing maps. I have been tempted to add both of these books to the "best" list, but I have not done so because I think VC theory is of doubtful practical utility for neural nets. However, if you are especially interested in VC theory and support vector machines, then both of these books can be highly recommended. To help you choose between them, a detailed table of contents is provided below for each book.
Haykin, S. (1999),
Neural Networks: A Comprehensive Foundation, 2nd ed.,
Upper Saddle River, NJ: Prentice Hall, ISBN 0-13-273350-1.
The second edition is much better than the first, which has been
described as a core-dump of Haykin's brain. The second edition covers
more topics, is easier to understand, and has better examples.
Chapter headings: Introduction; Learning Processes; Single Layer
Perceptrons; Multilayer Perceptrons; Radial-Basis Function Networks;
Support Vector Machines; Committee Machines; Principal Components
Analysis; Self-Organizing Maps; Information-Theoretic Models; Stochastic
Machines And Their Approximates Rooted in Statistical Mechanics;
Neurodynamic Programming; Temporal Processing Using Feedforward
Networks; Neurodynamics; Dynamically Driven Recurrent Networks.
Kecman, V. (2001),
Learning and Soft Computing: Support Vector Machines,
Neural Networks, and Fuzzy Logic Models,
Cambridge, MA: The MIT Press; ISBN: 0-262-11255-8.
URL: http://www.support-vector.ws/
Detailed Table of Contents: 1. Learning and Soft Computing: Rationale, Motivations, Needs, Basics 1.1 Examples of Applications in Diverse Fields 1.2 Basic Tools of Soft Computing: Neural Networks, Fuzzy Logic Systems, and Support Vector Machines 1.2.1 Basics of Neural Networks 1.2.2 Basics of Fuzzy Logic Modeling 1.3 Basic Mathematics of Soft Computing 1.3.1 Approximation of Multivariate Functions 1.3.2 Nonlinear Error Surface and Optimization 1.4 Learning and Statistical Approaches to Regression and Classification 1.4.1 Regression 1.4.2 Classification Problems Simulation Experiments 2. Support Vector Machines 2.1 Risk Minimization Principles and the Concept of Uniform Convergence 2.2 The VC Dimension 2.3 Structural Risk Minimization 2.4 Support Vector Machine Algorithms 2.4.1 Linear Maximal Margin Classifier for Linearly Separable Data 2.4.2 Linear Soft Margin Classifier for Overlapping Classes 2.4.3 The Nonlinear Classifier 2.4.4 Regression by Support Vector Machines Problems Simulation Experiments 3. Single-Layer Networks 3.1 The Perceptron 3.1.1 The Geometry of Perceptron Mapping 3.1.2 Convergence Theorem and Perceptron Learning Rule 3.2 The Adaptive Linear Neuron (Adaline) and the Least Mean Square Algorithm 3.2.1 Representational Capabilities of the Adaline 3.2.2 Weights Learning for a Linear Processing Unit Problems Simulation Experiments 4. Multilayer Perceptrons 4.1 The Error Backpropagation Algorithm 4.2 The Generalized Delta Rule 4.3 Heuristics or Practical Aspects of the Error Backpropagation Algorithm 4.3.1 One, Two, or More Hidden Layers? 4.3.2 Number of Neurons in a Hidden Layer, or the Bias-Variance Dilemma 4.3.3 Type of Activation Functions in a Hidden Layer and the Geometry of Approximation 4.3.4 Weights Initialization 4.3.5 Error Function for Stopping Criterion at Learning 4.3.6 Learning Rate and the Momentum Term Problems Simulation Experiments 5. Radial Basis Function Networks 5.1 Ill-Posed Problems and the Regularization Technique 5.2 Stabilizers and Basis Functions 5.3 Generalized Radial Basis Function Networks 5.3.1 Moving Centers Learning 5.3.2 Regularization with Nonradial Basis Functions 5.3.3 Orthogonal Least Squares 5.3.4 Optimal Subset Selection by Linear Programming Problems Simulation Experiments 6. Fuzzy Logic Systems 6.1 Basics of Fuzzy Logic Theory 6.1.1 Crisp (or Classic) and Fuzzy Sets 6.1.2 Basic Set Operations 6.1.3 Fuzzy Relations 6.1.4 Composition of Fuzzy Relations 6.1.5 Fuzzy Inference 6.1.6 Zadeh's Compositional Rule of Inference 6.1.7 Defuzzification 6.2 Mathematical Similarities between Neural Networks and Fuzzy Logic Models 6.3 Fuzzy Additive Models Problems Simulation Experiments 7. Case Studies 7.1 Neural Networks-Based Adaptive Control 7.1.1 General Learning Architecture, or Direct Inverse Modeling 7.1.2 Indirect Learning Architecture 7.1.3 Specialized Learning Architecture 7.1.4 Adaptive Backthrough Control 7.2 Financial Time Series Analysis 7.3 Computer Graphics 7.3.1 One-Dimensional Morphing 7.3.2 Multidimensional Morphing 7.3.3 Radial Basis Function Networks for Human Animation 7.3.4 Radial Basis Function Networks for Engineering Drawings 8. 
Basic Nonlinear Optimization Methods 8.1 Classical Methods 8.1.1 Newton-Raphson Method 8.1.2 Variable Metric or Quasi-Newton Methods 8.1.3 Davidon-Fletcher-Powell Method 8.1.4 Broyden-Fletcher-Goldfarb-Shanno Method 8.1.5 Conjugate Gradient Methods 8.1.6 Fletcher-Reeves Method 8.1.7 Polak-Ribiere Method 8.1.8 Two Specialized Algorithms for a Sum-of-Error-Squares Error Function Gauss-Newton Method Levenberg-Marquardt Method 8.2 Genetic Algorithms and Evolutionary Computing 8.2.1 Basic Structure of Genetic Algorithms 8.2.2 Mechanism of Genetic Algorithms 9. Mathematical Tools of Soft Computing 9.1 Systems of Linear Equations 9.2 Vectors and Matrices 9.3 Linear Algebra and Analytic Geometry 9.4 Basics of Multivariable Analysis 9.5 Basics from Probability Theory

Cherkassky, V.S., and Mulier, F.M. (1998), Learning from Data: Concepts, Theory, and Methods, NY: John Wiley & Sons, ISBN: 0-471-15493-8.
Detailed Table of Contents: 1 Introduction 1.1 Learning and Statistical Estimation 1.2 Statistical Dependency and Causality 1.3 Characterization of Variables 1.4 Characterization of Uncertainty References 2 Problem Statement, Classical Approaches, and Adaptive Learning 2.1 Formulation of the Learning Problem 2.1.1 Role of the Learning Machine 2.1.2 Common Learning Tasks 2.1.3 Scope of the Learning Problem Formulation 2.2 Classical Approaches 2.2.1 Density Estimation 2.2.2 Classification (Discriminant Analysis) 2.2.3 Regression 2.2.4 Stochastic Approximation 2.2.5 Solving Problems with Finite Data 2.2.6 Nonparametric Methods 2.3 Adaptive Learning: Concepts and Inductive Principles 2.3.1 Philosophy, Major Concepts, and Issues 2.3.2 A priori Knowledge and Model Complexity 2.3.3 Inductive Principles 2.4 Summary References 3 Regularization Framework 3.1 Curse and Complexity of Dimensionality 3.2 Function Approx. and Characterization of Complexity 3.3 Penalization 3.3.1 Parametric Penalties 3.3.2 Nonparametric Penalties 3.4 Model Selection (Complexity Control) 3.4.1 Analytical Model Selection Criteria 3.4.2 Model Selection via Resampling 3.4.3 Bias-variance Trade-off 3.4.4 Example of Model Selection 3.5 Summary References 4 Statistical Learning Theory 4.1 Conditions for Consistency and Convergence of ERM 4.2 Growth Function and VC-Dimension 4.2.1 VC-Dimension of the Set of Real-Valued Functions 4.2.2 VC-Dim. for Classification and Regression Problems 4.2.3 Examples of Calculating VC-Dimension 4.3 Bounds on the Generalization 4.3.1 Classification 4.3.2 Regression 4.3.3 Generalization Bounds and Sampling Theorem 4.4 Structural Risk Minimization 4.5 Case Study: Comparison of Methods for Model Selection 4.6 Summary References 5 Nonlinear Optimization Strategies 5.1 Stochastic Approximation Methods 5.1.1 Linear Parameter Estimation 5.1.2 Backpropagation Training of MLP Networks 5.2 Iterative Methods 5.2.1 Expectation-Maximization Methods for Density Est. 5.2.2 Generalized Inverse Training of MLP Networks 5.3 Greedy Optimization 5.3.1 Neural Network Construction Algorithms 5.3.2 Classification and Regression Trees (CART) 5.4 Feature Selection, Optimization, and Stat. Learning Th. 5.5 Summary References 6 Methods for Data Reduction and Dim. Reduction 6.1 Vector Quantization 6.1.1 Optimal Source Coding in Vector Quantization 6.1.2 Generalized Lloyd Algorithm 6.1.3 Clustering and Vector Quantization 6.1.4 EM Algorithm for VQ and Clustering 6.2 Dimensionality Reduction: Statistical Methods 6.2.1 Linear Principal Components 6.2.2 Principal Curves and Surfaces 6.3 Dimensionality Reduction: Neural Network Methods 6.3.1 Discrete Principal Curves and Self-org. Map Alg. 
6.3.2 Statistical Interpretation of the SOM Method 6.3.3 Flow-through Version of the SOM and Learning Rate Schedules 6.3.4 SOM Applications and Modifications 6.3.5 Self-supervised MLP 6.4 Summary References 7 Methods for Regression 7.1 Taxonomy: Dictionary versus Kernel Representation 7.2 Linear Estimators 7.2.1 Estimation of Linear Models and Equivalence of Representations 7.2.2 Analytic Form of Cross-validation 7.2.3 Estimating Complexity of Penalized Linear Models 7.3 Nonadaptive Methods 7.3.1 Local Polynomial Estimators and Splines 7.3.2 Radial Basis Function Networks 7.3.3 Orthogonal Basis Functions and Wavelets 7.4 Adaptive Dictionary Methods 7.4.1 Additive Methods and Projection Pursuit Regression 7.4.2 Multilayer Perceptrons and Backpropagation 7.4.3 Multivariate Adaptive Regression Splines 7.5 Adaptive Kernel Methods and Local Risk Minimization 7.5.1 Generalized Memory-Based Learning 7.5.2 Constrained Topological Mapping 7.6 Empirical Comparisons 7.6.1 Experimental Setup 7.6.2 Summary of Experimental Results 7.7 Combining Predictive Models 7.8 Summary References 8 Classification 8.1 Statistical Learning Theory formulation 8.2 Classical Formulation 8.3 Methods for Classification 8.3.1 Regression-Based Methods 8.3.2 Tree-Based Methods 8.3.3 Nearest Neighbor and Prototype Methods 8.3.4 Empirical Comparisons 8.4 Summary References 9 Support Vector Machines 9.1 Optimal Separating Hyperplanes 9.2 High Dimensional Mapping and Inner Product Kernels 9.3 Support Vector Machine for Classification 9.4 Support Vector Machine for Regression 9.5 Summary References 10 Fuzzy Systems 10.1 Terminology, Fuzzy Sets, and Operations 10.2 Fuzzy Inference Systems and Neurofuzzy Systems 10.2.1 Fuzzy Inference Systems 10.2.2 Equivalent Basis Function Representation 10.2.3 Learning Fuzzy Rules from Data 10.3 Applications in Pattern Recognition 10.3.1 Fuzzy Input Encoding and Fuzzy Postprocessing 10.3.2 Fuzzy Clustering 10.4 Summary References Appendix A: Review of Nonlinear Optimization Appendix B: Eigenvalues and Singular Value Decomposition
Rosenblatt, F. (1962), Principles of Neurodynamics, NY: Spartan Books. Out of print.
Anderson, J.A., and Rosenfeld, E., eds. (1988),
Neurocomputing: Foundations of Research,
Cambridge, MA: The MIT Press, ISBN 0-262-01097-6.
Author's Webpage: http://www.cog.brown.edu/~anderson
Book Webpage (Publisher):
http://mitpress.mit.edu/book-home.tcl?isbn=0262510480
43 articles of historical importance, ranging from William James
to Rumelhart, Hinton, and Williams.
Anderson, J. A., Pellionisz, A. and Rosenfeld, E. (Eds). (1990).
Neurocomputing 2: Directions for Research.
The MIT Press: Cambridge, MA.
Author's Webpage: http://www.cog.brown.edu/~anderson
Book Webpage (Publisher):
http://mitpress.mit.edu/book-home.tcl?isbn=0262510758
Carpenter, G.A., and Grossberg, S., eds. (1991),
Pattern Recognition by Self-Organizing Neural Networks,
Cambridge, MA: The MIT Press, ISBN 0-262-03176-0
Articles on ART, BAM, SOMs, counterpropagation, etc.
Nilsson, N.J. (1965/1990), Learning Machines, San Mateo, CA: Morgan Kaufmann, ISBN 1-55860-123-6.
Minsky, M.L., and Papert, S.A. (1969/1988) Perceptrons, Cambridge, MA: The MIT Press, 1st ed. 1969, expanded edition 1988 ISBN 0-262-63111-3.
Werbos, P.J. (1994), The Roots of Backpropagation, NY: John Wiley & Sons, ISBN: 0471598976. Includes Werbos's 1974 Harvard Ph.D. thesis, Beyond Regression.
Kohonen, T. (1984/1989),
Self-organization and Associative Memory,
1st ed. 1984, 3rd ed. 1989, NY: Springer.
Author's Webpage: http://www.cis.hut.fi/nnrc/teuvo.html
Book Webpage (Publisher): http://www.springer.de/
Additional Information: Book is out of print.
Rumelhart, D. E. and McClelland, J. L. (1986),
Parallel Distributed Processing: Explorations in the
Microstructure of Cognition,
Volumes 1 & 2, Cambridge, MA: The MIT Press ISBN 0-262-63112-1.
Author's Webpage:
http://www-med.stanford.edu/school/Neurosciences/faculty/rumelhart.html
Book Webpage (Publisher):
http://mitpress.mit.edu/book-home.tcl?isbn=0262631121
Hecht-Nielsen, R. (1990),
Neurocomputing,
Reading, MA: Addison-Wesley, ISBN 0-201-09355-3.
Book Webpage (Publisher): http://www.awl.com/
Anderson, J.A., and Rosenfeld, E., eds. (1998), Talking Nets: An Oral History of Neural Networks, Cambridge, MA: The MIT Press, ISBN 0-262-51111-8.
Cloete, I., and Zurada, J.M. (2000),
Knowledge-Based Neurocomputing,
Cambridge, MA: The MIT Press, ISBN 0-262-03274-0.
Articles: Knowledge-Based Neurocomputing: Past, Present, and Future;
Architectures and Techniques for Knowledge-Based Neurocomputing;
Symbolic Knowledge Representation in Recurrent Neural Networks: Insights
from Theoretical Models of Computation; A Tutorial on Neurocomputing of
Structures; Structural Learning and Rule Discovery; VL[subscript 1]ANN:
Transformation of Rules to Artificial Neural Networks; Integrations of
Heterogeneous Sources of Partial Domain Knowledge; Approximation of
Differential Equations Using Neural Networks; Fynesse: A Hybrid
Architecture for Self-Learning Control; Data Mining Techniques for
Designing Neural Network Time Series Predictors; Extraction of Decision
Trees from Artificial Neural Networks; Extraction of Linguistic
Rules from Data via Neural Networks and Fuzzy Approximation; Neural
Knowledge Processing in Expert Systems.
Anthony, M., and Bartlett, P.L. (1999), Neural Network Learning: Theoretical Foundations, Cambridge: Cambridge University Press, ISBN 0-521-57353-X.
Vapnik, V.N. (1998)
Statistical Learning Theory,
NY: Wiley, ISBN: 0471030031
This book is much better than Vapnik's The Nature of Statistical
Learning Theory.
Chapter headings:
0. Introduction: The Problem of Induction and Statistical Inference;
1. Two Approaches to the Learning Problem;
Appendix: Methods for Solving Ill-Posed Problems;
2. Estimation of the Probability Measure and Problem of Learning;
3. Conditions for Consistency of Empirical Risk Minimization Principle;
4. Bounds on the Risk for Indicator Loss Functions;
Appendix: Lower Bounds on the Risk of the ERM Principle;
5. Bounds on the Risk for Real-Valued Loss Functions;
6. The Structural Risk Minimization Principle;
Appendix: Estimating Functions on the Basis of Indirect Measurements;
7. Stochastic Ill-Posed Problems;
8. Estimating the Values of Functions at Given Points;
9. Perceptrons and Their Generalizations;
10. The Support Vector Method for Estimating Indicator Functions;
11. The Support Vector Method for Estimating Real-Valued Functions;
12. SV Machines for Pattern Recognition;
(includes examples of digit recognition)
13. SV Machines for Function Approximations, Regression Estimation, and Signal Processing;
(includes an example of positron emission tomography)
14. Necessary and Sufficient Conditions for Uniform Convergence of Frequencies to Their Probabilities;
15. Necessary and Sufficient Conditions for Uniform Convergence of Means to Their Expectations;
16. Necessary and Sufficient Conditions for Uniform One-Sided Convergence of Means to Their Expectations;
Comments and Bibliographical Remarks.
There are many excellent books about NNs by Timothy Masters (listed elsewhere in the FAQ) that provide C++ code for NNs. If you simply want code that works, these books should satisfy your needs. If you want code that exemplifies the highest standards of object oriented design, you will be disappointed by Masters.
The one book on OOP for NNs that seems to be consistently praised is:
Rogers, Joey (1996),
Object-Oriented Neural Networks in C++,
Academic Press, ISBN 0125931158.
Contents:
1. Introduction
2. Object-Oriented Programming Review
3. Neural-Network Base Classes
4. ADALINE Network
5. Backpropagation Neural Network
6. Self-Organizing Neural Network
7. Bidirectional Associative Memory
Appendix A Support Classes
Appendix B Listings
References and Suggested Reading
However, you will learn very little about NNs other than elementary programming techniques from Rogers. To quote a customer review at the Barnes & Noble web site (http://www.bn.com):
A reviewer, a scientific programmer, July 19, 2000, **** Long explanation of neural net code - not of neural nets. Good OO code for simple 'off the shelf' implementation, very open & fairly extensible for further customization. A complete & lucid explanation of the code but pretty weak on the principles, theory, and application of neural networks. Great as a code source, disappointing as a neural network tutorial.
Bertsekas, D. P. and Tsitsiklis, J. N. (1996), Neuro-Dynamic
Programming,
Belmont, MA: Athena Scientific, ISBN 1-886529-10-8.
Author's Webpage: http://www.mit.edu:8001/people/dimitrib/home.html
and http://web.mit.edu/jnt/www/home.html
Book Webpage (Publisher): http://world.std.com/~athenasc/ndpbook.html
Kay, J.W., and Titterington, D.M. (1999)
Statistics and Neural Networks: Advances at the Interface,
Oxford: Oxford University Press, ISBN 0-19-852422-6.
Articles: Flexible Discriminant and Mixture Models; Neural Networks for
Unsupervised Learning Based on Information Theory; Radial Basis Function
Networks and Statistics; Robust Prediction in Many-parameter Models;
Density Networks; Latent Variable Models and Data Visualisation;
Analysis of Latent Structure Models with Multidimensional Latent
Variables; Artificial Neural Networks and Multivariate Statistics.
White, H. (1992b),
Artificial Neural Networks: Approximation and Learning Theory,
Blackwell, ISBN: 1557863296.
Articles: There Exists a Neural Network That Does Not Make Avoidable
Mistakes; Multilayer Feedforward Networks Are Universal Approximators;
Universal Approximation Using Feedforward Networks with Non-sigmoid
Hidden Layer Activation Functions; Approximating and Learning Unknown
Mappings Using Multilayer Feedforward Networks with Bounded Weights;
Universal Approximation of an Unknown Mapping and Its Derivatives;
Neural Network Learning and Statistics; Learning in Artificial Neural
Networks: a Statistical Perspective; Some Asymptotic Results for
Learning in Single Hidden Layer Feedforward Networks; Connectionist
Nonparametric Regression: Multilayer Feedforward Networks Can Learn
Arbitrary Mappings; Nonparametric Estimation of Conditional Quantiles
Using Neural Networks; On Learning the Derivatives of an Unknown Mapping
with Multilayer Feedforward Networks; Consequences and Detection of
Misspecified Nonlinear Regression Models; Maximum Likelihood Estimation
of Misspecified Models; Some Results for Sieve Estimation with Dependent
Observations.
Deco, G. and Obradovic, D. (1996), An Information-Theoretic Approach to Neural Computing, NY: Springer-Verlag, ISBN 0-387-94666-7.
Diamantaras, K.I., and Kung, S.Y. (1996) Principal Component Neural Networks: Theory and Applications, NY: Wiley, ISBN 0-471-05436-4.
Van Hulle, M.M. (2000), Faithful Representations and Topographic Maps: From Distortion- to Information-Based Self-Organization, NY: Wiley, ISBN 0-471-34507-5.
     [The on-line delta] rule always takes the most efficient route from
     the current position of the weight vector to the "ideal" position,
     based on the current input pattern. The delta rule not only minimizes
     the mean squared error, it does so in the most efficient fashion
     possible--quite an achievement for such a simple rule.

While the authors realize that backpropagation networks can suffer from local minima, they mistakenly think that counterpropagation has some kind of global optimization ability (p. 202):

     Unlike the backpropagation network, a counterpropagation network
     cannot be fooled into finding a local minimum solution. This means
     that the network is guaranteed to find the correct response (or the
     nearest stored response) to an input, no matter what.

But even though they acknowledge the problem of local minima, the authors are ignorant of the importance of initial weight values (p. 186):

     To teach our imaginary network something using backpropagation, we
     must start by setting all the adaptive weights on all the neurodes in
     it to random values. It won't matter what those values are, as long
     as they are not all the same and not equal to 1.

Like most introductory books, this one neglects the difficulties of getting good generalization--the authors simply declare (p. 8) that "A neural network is able to generalize"!
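For readers who have not seen it written out, the quoted "on-line delta rule" is just the LMS update for a single linear unit. Here is a minimal sketch (my own, with made-up data and an arbitrary learning rate), not code from the book:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))           # made-up inputs
    true_w = np.array([1.0, -2.0, 0.5])
    y = X @ true_w                          # made-up noiseless linear targets

    w = np.zeros(3)
    eta = 0.05                              # learning rate, an arbitrary choice
    for epoch in range(20):
        for x_i, t_i in zip(X, y):
            out = w @ x_i                   # current output of the linear unit
            w += eta * (t_i - out) * x_i    # the on-line delta (LMS) update
    print(w)                                # close to true_w on this easy, noiseless problem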
Chester, M. (1993). Neural Networks: A Tutorial,
Englewood Cliffs, NJ: PTR Prentice Hall.
Book Webpage (Publisher): http://www.prenhall.com/
Additional Information: Seems to be out of print.
Shallow, sometimes confused, especially with regard to Kohonen networks.
Dayhoff, J. E. (1990). Neural Network Architectures: An Introduction.
Van Nostrand Reinhold: New York.
Comments from readers of comp.ai.neural-nets: "Like Wasserman's book, Dayhoff's book is also very easy to
understand".
Freeman, James (1994). Simulating Neural Networks with
Mathematica, Addison-Wesley, ISBN: 0-201-56629-X.
Book Webpage (Publisher):
http://cseng.aw.com/bookdetail.qry?ISBN=0-201-56629-X&ptype=0
Additional Information: Sourcecode available under:
ftp://ftp.mathsource.com/pub/Publications/BookSupplements/Freeman-1993
Helps the reader make his own NNs. The Mathematica code for the programs in the book
is also available through the internet: Send mail to
MathSource@wri.com or try
http://www.wri.com/ on the World Wide
Web.
Freeman, J.A. and Skapura, D.M. (1991). Neural Networks:
Algorithms, Applications, and Programming Techniques,
Reading, MA: Addison-Wesley.
Book Webpage (Publisher): http://www.awl.com/
Additional Information: Seems to be out of print.
A good book for beginning programmers who want to learn how to write
NN programs while avoiding any understanding of what NNs do or why
they do it.
Gately, E. (1996). Neural Networks for Financial Forecasting.
New York: John Wiley and Sons, Inc.
Book Webpage (Publisher): http://www.wiley.com/
Additional Information: One has to search.
Franco Insana comments:
* Decent book for the neural net beginner
* Very little devoted to statistical framework, although there is some formulation of backprop theory
* Some food for thought
* Nothing here for those with any neural net experience
McClelland, J. L. and Rumelhart, D. E. (1988).
Explorations in Parallel Distributed Processing: Computational Models of
Cognition and Perception (software manual). The MIT Press.
Book Webpage (Publisher):
http://mitpress.mit.edu/book-home.tcl?isbn=026263113X (IBM version) and
http://mitpress.mit.edu/book-home.tcl?isbn=0262631296 (Macintosh)
Comments from readers of comp.ai.neural-nets: "Written in a tutorial
style, and includes 2 diskettes of NN simulation programs that can be
compiled on MS-DOS or Unix (and they do too !)"; "The programs are
pretty reasonable as an introduction to some of the things that NNs can
do."; "There are *two* editions of this book. One comes with disks for
the IBM PC, the other comes with disks for the Macintosh".
McCord Nelson, M. and Illingworth, W.T. (1990). A Practical Guide to Neural
Nets. Addison-Wesley Publishing Company, Inc. (ISBN 0-201-52376-0).
Book Webpage (Publisher):
http://cseng.aw.com/bookdetail.qry?ISBN=0-201-63378-7&ptype=1174
Lots of applications without technical details, lots of hype,
lots of goofs, no formulas.
Muller, B., Reinhardt, J., Strickland, M. T. (1995). Neural
Networks: An Introduction (2nd ed.). Berlin, Heidelberg, New York:
Springer-Verlag. ISBN 3-540-60207-0. (DOS 3.5" disk included.)
Book Webpage (Publisher):
http://www.springer.de/catalog/html-files/deutsch/phys/3540602070.html
Comments from readers of comp.ai.neural-nets: "The book was developed
out of a course on neural-network models with computer demonstrations
that was taught by the authors to Physics students. The book comes
together with a PC-diskette. The book is divided into three parts: (1)
Models of Neural Networks; describing several architectures and learning
rules, including the mathematics. (2) Statistical Physics of Neural
Networks; "hard-core" physics section developing formal theories of
stochastic neural networks. (3) Computer Codes; explanation about the
demonstration programs. First part gives a nice introduction into
neural networks together with the formulas. Together with the
demonstration programs a 'feel' for neural networks can be developed."
Orchard, G.A. & Phillips, W.A. (1991). Neural Computation: A
Beginner's Guide. Lawrence Erlbaum Associates: London.
Comments from readers of comp.ai.neural-nets: "Short user-friendly
introduction to the area, with a non-technical flavour. Apparently
accompanies a software package, but I haven't seen that yet".
Rao, V.B, and Rao, H.V. (1993). C++ Neural Networks and Fuzzy Logic.
MIS:Press, ISBN 1-55828-298-x, US $45 incl. disks.
Covers a wider variety of networks than Masters (1993),
but is shallow and lacks Masters's insight into practical
issues of using NNs.
Wasserman, P. D. (1989). Neural Computing: Theory & Practice.
Van Nostrand Reinhold: New York. (ISBN 0-442-20743-3)
This is not as bad as some books on NNs. It provides an elementary
account of the mechanics of a variety of networks. But it provides
no insight into why various methods behave as they do, or under
what conditions a method will or will not work well. It has no
discussion of efficient training methods such as RPROP or
conventional numerical optimization techniques. And, most egregiously,
it has no explanation of overfitting and generalization beyond the
patently false statement on p. 2 that "It is important to note that
the artificial neural network generalizes automatically as a result
of its structure"! There is no mention of training, validation, and
test sets, or of other methods for estimating generalization error.
There is no practical advice on the important issue of choosing the
number of hidden units. There is no discussion of early stopping
or weight decay. The reader will come away from this book with a
grossly oversimplified view of NNs and no concept whatsoever of how
to use NNs for practical applications.
Comments from readers of comp.ai.neural-nets: "Wasserman flatly enumerates some common architectures from an engineer's perspective ('how it works') without ever addressing the underlying fundamentals ('why it works') - important basic concepts such as clustering, principal components or gradient descent are not treated. It's also full of errors, and unhelpful diagrams drawn with what appears to be PCB board layout software from the '70s. For anyone who wants to do active research in the field I consider it quite inadequate"; "Okay, but too shallow"; "Quite easy to understand"; "The best bedtime reading for Neural Networks. I have given this book to numerous colleagues who want to know NN basics, but who never plan to implement anything. An excellent book to give your manager."
Book Webpage (Publisher):
http://www.prenhall.com/books/ptr_0136123260.html
Levine, D. S. (2000). Introduction to Neural and Cognitive Modeling.
2nd ed., Lawrence Erlbaum: Hillsdale, N.J.
Comments from readers of comp.ai.neural-nets: "Highly recommended".
Maren, A., Harston, C. and Pap, R., (1990). Handbook of Neural Computing
Applications. Academic Press. ISBN: 0-12-471260-6. (451 pages)
Comments from readers of comp.ai.neural-nets: "They cover a broad area";
"Introductory with suggested applications implementation".
Pao, Y. H. (1989). Adaptive Pattern Recognition and Neural Networks
Addison-Wesley Publishing Company, Inc. (ISBN 0-201-12584-6)
Book Webpage (Publisher): http://www.awl.com/
Comments from readers of comp.ai.neural-nets: "An excellent book that
ties together classical approaches to pattern recognition with Neural
Nets. Most other NN books do not even mention conventional
approaches."
Refenes, A. (Ed.) (1995). Neural Networks in the Capital Markets.
Chichester, England: John Wiley and Sons, Inc.
Book Webpage (Publisher): http://www.wiley.com/
Additional Information: One has to search.
Franco Insana comments:
* Not for the beginner
* Excellent introductory material presented by editor in first 5 chapters, which could be a valuable reference source for any practitioner
* Very thought-provoking
* Mostly backprop-related
* Most contributors lay good statistical foundation
* Overall, a wealth of information and ideas, but the reader has to sift through it all to come away with anything useful
Simpson, P. K. (1990). Artificial Neural Systems: Foundations, Paradigms,
Applications and Implementations. Pergamon Press: New York.
Comments from readers of comp.ai.neural-nets: "Contains a very useful 37
page bibliography. A large number of paradigms are presented. On the
negative side the book is very shallow. Best used as a complement to
other books".
Wasserman, P.D. (1993). Advanced Methods in Neural Computing.
Van Nostrand Reinhold: New York (ISBN: 0-442-00461-3).
Comments from readers of comp.ai.neural-nets: "Several neural network
topics are discussed e.g. Probabilistic Neural Networks, Backpropagation
and beyond, neural control, Radial Basis Function Networks, Neural
Engineering. Furthermore, several subjects related to neural networks
are mentioned e.g. genetic algorithms, fuzzy logic, chaos. Just the
functionality of these subjects is described; enough to get you started.
Lots of references are given to more elaborate descriptions. Easy to
read, no extensive mathematical background necessary."
Zeidenberg, M. (1990). Neural Networks in Artificial Intelligence.
Ellis Horwood, Ltd., Chichester.
Comments from readers of comp.ai.neural-nets: "Gives the AI point of
view".
Zornetzer, S. F., Davis, J. L. and Lau, C. (1990). An Introduction to
Neural and Electronic Networks. Academic Press. (ISBN 0-12-781881-2)
Comments from readers of comp.ai.neural-nets: "Covers quite a broad
range of topics (collection of articles/papers )."; "Provides a
primer-like introduction and overview for a broad audience, and employs
a strong interdisciplinary emphasis".
Zurada, Jacek M. (1992). Introduction To Artificial Neural Systems.
Hardcover, 785 Pages, 317 Figures, ISBN 0-534-95460-X, 1992, PWS Publishing
Company, Price: $56.75 (includes shipping, handling, and the ANS software
diskette). Solutions Manual available.
Comments from readers of comp.ai.neural-nets: "Cohesive and
comprehensive book on neural nets; as an engineering-oriented
introduction, but also as a research foundation. Thorough exposition of
fundamentals, theory and applications. Training and recall algorithms
appear in boxes showing steps of algorithms, thus making programming of
learning paradigms easy. Many illustrations and intuitive examples.
Winner among NN textbooks at a senior UG/first year graduate level-[175
problems]." Contents: Intro, Fundamentals of Learning, Single-Layer &
Multilayer Perceptron NN, Assoc. Memories, Self-organizing and Matching
Nets, Applications, Implementations, Appendix)
Mr Blum has not only contributed a masterpiece of NN inaccuracy but also seems to lack a fundamental understanding of Object Orientation.
The excessive use of virtual methods (see page 32 for example), the inclusion of unnecessary 'friend' relationships (page 133) and a penchant for operator overloading (pick a page!) demonstrate inability in C++ and/or OO.
The introduction to OO that is provided trivialises the area and demonstrates a distinct lack of direction and/or understanding.
The public interfaces to classes are overspecified and the design relies upon the flawed neuron/layer/network model.
There is a notable disregard for any notion of a robust class hierarchy which is demonstrated by an almost total lack of concern for inheritance and associated reuse strategies.
The attempt to rationalise differing types of Neural Network into a single very shallow but wide class hierarchy is naive.
The general use of the 'float' data type would cause serious hassle if this software could possibly be extended to use some of the more sensitive variants of backprop on more difficult problems. It is a matter of great fortune that such software is unlikely to be reusable and will therefore, like all good dinosaurs, disappear with the passage of time.
The irony is that there is a card in the back of the book asking the unfortunate reader to part with a further $39.95 for a copy of the software (already included in print) on a 5.25" disk.
The author claims that his work provides an 'Object Oriented Framework ...'. This can best be put in his own terms (Page 137):
... garble(float noise) ...
Before attempting to review the code associated with this book it should be clearly stated that it is supplied as an extra--almost as an afterthought. This may be a wise move.
Although not as bad as other (even commercial) implementations, the code provided lacks proper OO structure and is typical of C++ written in a C style.
Style criticisms include:
In a generous sense the code is free and the author doesn't claim any expertise in software engineering. It works in a limited sense but would be difficult to extend and/or reuse. It's fine for demonstration purposes in a stand-alone manner and for use with the book concerned.
If you're serious about nets you'll end up rewriting the whole lot (or getting something better).
Blum, Adam (1992), Neural Networks in C++, NY: Wiley.
Welstead, Stephen T. (1994), Neural Network and Fuzzy Logic Applications in C/C++, NY: Wiley.
Both Blum and Welstead contribute to the dangerous myth that any idiot can use a neural net by dumping in whatever data are handy and letting it train for a few days. They both have little or no discussion of generalization, validation, and overfitting. Neither provides any valid advice on choosing the number of hidden nodes. If you have ever wondered where these stupid "rules of thumb" that pop up frequently come from, here's a source for one of them:
"A rule of thumb is for the size of this [hidden] layer to be somewhere between the input layer size ... and the output layer size ..." Blum, p. 60.(John Lazzaro tells me he recently "reviewed a paper that cited this rule of thumb--and referenced this book! Needless to say, the final version of that paper didn't include the reference!")
Blum offers some profound advice on choosing inputs:
"The next step is to pick as many input factors as possible that might be related to [the target]."Blum also shows a deep understanding of statistics:
"A statistical model is simply a more indirect way of learning correlations. With a neural net approach, we model the problem directly." p. 8.Blum at least mentions some important issues, however simplistic his advice may be. Welstead just ignores them. What Welstead gives you is code--vast amounts of code. I have no idea how anyone could write that much code for a simple feedforward NN. Welstead's approach to validation, in his chapter on financial forecasting, is to reserve two cases for the validation set!
My comments apply only to the text of the above books. I have not examined or attempted to compile the code.
Swingler, K. (1996), Applying Neural Networks: A Practical Guide, London: Academic Press.
This book has lots of good advice liberally sprinkled with errors, incorrect formulas, some bad advice, and some very serious mistakes. Experts will learn nothing, while beginners will be unable to separate the useful information from the dangerous. For example, there is a chapter on "Data encoding and re-coding" that would be very useful to beginners if it were accurate, but the formula for the standard deviation is wrong, and the description of the softmax function is of something entirely different than softmax (see What is a softmax activation function?). Even more dangerous is the statement on p. 28 that "Any pair of variables with high covariance are dependent, and one may be chosen to be discarded." Although high correlations can be used to identify redundant inputs, it is incorrect to use high covariances for this purpose, since a covariance can be high simply because one of the inputs has a high standard deviation.
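To see why covariance alone is misleading, consider this small numerical sketch (mine, not Swingler's): rescaling one variable inflates the covariance without changing the correlation, so only the correlation indicates how redundant two inputs really are.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=10000)
    y = 0.3 * x + rng.normal(size=10000)     # correlation with x is only about 0.29
    y_big = 1000.0 * y                       # same information, huge standard deviation

    print(np.cov(x, y)[0, 1])                # roughly 0.3  (small covariance)
    print(np.cov(x, y_big)[0, 1])            # roughly 300  (large covariance, same relationship)
    print(np.corrcoef(x, y_big)[0, 1])       # still roughly 0.29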
The most ludicrous thing I've found in the book is the claim that Hecht-Nielsen used Kolmogorov's theorem to show that "you will never require more than twice the number of hidden units as you have inputs" (p. 53) in an MLP with one hidden layer. Actually, Hecht-Nielsen says "the direct usefulness of this result is doubtful, because no constructive method for developing the [output activation] functions is known." Then Swingler implies that V. Kurkova (1991, "Kolmogorov's theorem is relevant," Neural Computation, 3, 617-622) confirmed this alleged upper bound on the number of hidden units, saying that, "Kurkova was able to restate Kolmogorov's theorem in terms of a set of sigmoidal functions." If Kolmogorov's theorem, or Hecht-Nielsen's adaptation of it, could be restated in terms of known sigmoid activation functions in the (single) hidden and output layers, then Swingler's alleged upper bound would be correct, but in fact no such restatement of Kolmogorov's theorem is possible, and Kurkova did not claim to prove any such restatement. Swingler omits the crucial details that Kurkova used two hidden layers, staircase-like activation functions (not ordinary sigmoidal functions such as the logistic) in the first hidden layer, and a potentially large number of units in the second hidden layer. Kurkova later estimated the number of units required for uniform approximation within an error epsilon as nm(m+1) in the first hidden layer and m^2(m+1)^n in the second hidden layer, where n is the number of inputs and m "depends on epsilon/||f|| as well as on the rate with which f increases distances." In other words, Kurkova says nothing to support Swingler's advice (repeated on p. 55), "Never choose h to be more than twice the number of input units." Furthermore, constructing a counterexample to Swingler's advice is trivial: use one input and one output, where the output is the sine of the input, and the domain of the input extends over many cycles of the sine wave; it is obvious that many more than two hidden units are required. For some sound information on choosing the number of hidden units, see How many hidden units should I use?
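The sine counterexample is easy to try for yourself. Here is a rough sketch using scikit-learn's MLPRegressor (my own illustration, restricted to a few cycles so that training is easy; the exact error values will vary with the solver and random seed):

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # One input, one output, target = sin(x) over a few full cycles.
    # Swingler's rule would cap the hidden layer at 2 units (twice one input).
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 4.0 * np.pi, size=1000).reshape(-1, 1)
    y = np.sin(x).ravel()

    for n_hidden in (2, 50):
        net = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation='tanh',
                           solver='lbfgs', max_iter=5000, random_state=0)
        net.fit(x, y)
        mse = np.mean((net.predict(x) - y) ** 2)
        print(n_hidden, "hidden units: training MSE =", round(mse, 4))

    # With 2 tanh hidden units the fitted curve can bend only a couple of
    # times, so the error stays large; with 50 units it should drop
    # dramatically (exact values depend on the run).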
Choosing the number of hidden units is one important aspect of getting good generalization, which is the most crucial issue in neural network training. There are many other considerations involved in getting good generalization, and Swingler makes several more mistakes in this area.
Swingler addresses many important practical issues, and often provides good practical advice. But the peculiar combination of much good advice with some extremely bad advice, a few examples of which are provided above, could easily seduce a beginner into thinking that the book as a whole is reliable. It is this danger that earns the book a place in "The Worst" list.
Dewdney, A.K. (1997), Yes, We Have No Neutrons: An Eye-Opening Tour through the Twists and Turns of Bad Science, NY: Wiley.
------------------------------------------------------------------------
[to be added: comments on speed of reviewing and publishing, whether they accept TeX format or ASCII by e-mail, etc.]
Title: Neural Networks
Publish: Pergamon Press
Address: Pergamon Journals Inc., Fairview Park, Elmsford, New York 10523, USA, and Pergamon Journals Ltd., Headington Hill Hall, Oxford OX3 0BW, England
Freq.: 10 issues/year (vol. 1 in 1988)
Cost/Yr: Free with INNS or JNNS or ENNS membership ($45?), Individual $65, Institution $175
ISSN #: 0893-6080
URL: http://www.elsevier.nl/locate/inca/841
Remark: Official Journal of the International Neural Network Society (INNS), European Neural Network Society (ENNS) and Japanese Neural Network Society (JNNS). Contains Original Contributions, Invited Review Articles, Letters to Editor, Book Reviews, Editorials, Announcements, Software Surveys.

Title: Neural Computation
Publish: MIT Press
Address: MIT Press Journals, 55 Hayward Street, Cambridge, MA 02142-9949, USA. Phone: (617) 253-2889
Freq.: Quarterly (vol. 1 in 1989)
Cost/Yr: Individual $45, Institution $90, Students $35; add $9 outside USA
ISSN #: 0899-7667
URL: http://mitpress.mit.edu/journals-legacy.tcl
Remark: Combination of Reviews (10,000 words), Views (4,000 words) and Letters (2,000 words). I have found this journal to be of outstanding quality. (Note: Remarks supplied by Mike Plonski "plonski@aero.org")

Title: NEURAL COMPUTING SURVEYS
Publish: Lawrence Erlbaum Associates
Address: 10 Industrial Avenue, Mahwah, NJ 07430-2262, USA
Freq.: Yearly
Cost/Yr: Free on-line
ISSN #: 1093-7609
URL: http://www.icsi.berkeley.edu/~jagota/NCS/
Remark: One way to cope with the exponential increase in the number of articles published in recent years is to ignore most of them. A second, perhaps more satisfying, approach is to provide a forum that encourages the regular production -- and perusal -- of high-quality survey articles. This is especially useful in an inter-disciplinary, evolving field such as neural networks. This journal aims to bring the second viewpoint to bear. It is intended to
  * encourage researchers to write good survey papers.
  * motivate researchers to look here first to check what's known on an unfamiliar topic.

Title: IEEE Transactions on Neural Networks
Publish: Institute of Electrical and Electronics Engineers (IEEE)
Address: IEEE Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331, USA. Tel: (201) 981-0060
Cost/Yr: $10 for Members belonging to participating IEEE societies
Freq.: Quarterly (vol. 1 in March 1990)
URL: http://www.ieee.org/nnc/pubs/transactions.html
Remark: Devoted to the science and technology of neural networks which disclose significant technical knowledge, exploratory developments and applications of neural networks from biology to software to hardware. Emphasis is on artificial neural networks. Specific aspects include self-organizing systems, neurobiological connections, network dynamics and architecture, speech recognition, electronic and photonic implementation, robotics and controls. Includes Letters concerning new research results. (Note: Remarks are from journal announcement)

Title: IEEE Transactions on Evolutionary Computation
Publish: Institute of Electrical and Electronics Engineers (IEEE)
Address: IEEE Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331, USA. Tel: (201) 981-0060
Cost/Yr: $10 for Members belonging to participating IEEE societies
Freq.: Quarterly (vol. 1 in May 1997)
URL: http://engine.ieee.org/nnc/pubs/transactions.html
Remark: The IEEE Transactions on Evolutionary Computation publishes archival-quality original papers in evolutionary computation and related areas, with particular emphasis on the practical application of the techniques to solving real problems in industry, medicine, and other disciplines. Specific techniques include but are not limited to evolution strategies, evolutionary programming, genetic algorithms, and associated methods of genetic programming and classifier systems. Papers emphasizing mathematical results should ideally seek to put these results in the context of algorithm design; however, purely theoretical papers will be considered. Other papers in the areas of cultural algorithms, artificial life, molecular computing, evolvable hardware, and the use of simulated evolution to gain a better understanding of naturally evolved systems are also encouraged. (Note: Remarks are from journal CFP)

Title: International Journal of Neural Systems
Publish: World Scientific Publishing
Address: USA: World Scientific Publishing Co., 1060 Main Street, River Edge, NJ 07666. Tel: (201) 487 9655; Europe: World Scientific Publishing Co. Ltd., 57 Shelton Street, London WC2H 9HE, England. Tel: (0171) 836 0888; Asia: World Scientific Publishing Co. Pte. Ltd., 1022 Hougang Avenue 1 #05-3520, Singapore 1953, Rep. of Singapore. Tel: 382 5663
Freq.: Quarterly (vol. 1 in 1990)
Cost/Yr: Individual $122, Institution $255 (plus $15-$25 for postage)
ISSN #: 0129-0657 (IJNS)
Remark: The International Journal of Neural Systems is a quarterly journal which covers information processing in natural and artificial neural systems. Contributions include research papers, reviews, and Letters to the Editor -- communications under 3,000 words in length, which are published within six months of receipt. Other contributions are typically published within nine months. The journal presents a fresh, undogmatic attitude towards this multidisciplinary field and aims to be a forum for novel ideas and improved understanding of collective and cooperative phenomena with computational capabilities. Papers should be submitted to World Scientific's UK office. Once a paper is accepted for publication, authors are invited to e-mail the LaTeX source file of their paper in order to expedite publication.

Title: International Journal of Neurocomputing
Publish: Elsevier Science Publishers, Journal Dept., PO Box 211, 1000 AE Amsterdam, The Netherlands
Freq.: Quarterly (vol. 1 in 1989)
URL: http://www.elsevier.nl/locate/inca/505628

Title: Neural Processing Letters
Publish: Kluwer Academic Publishers
Address: P.O. Box 322, 3300 AH Dordrecht, The Netherlands
Freq.: 6 issues/year (vol. 1 in 1994)
Cost/Yr: Individual $198, Institution $400 (including postage)
ISSN #: 1370-4621
URL: http://www.wkap.nl/journalhome.htm/1370-4621
Remark: The aim of the journal is to rapidly publish new ideas, original developments and work in progress. Neural Processing Letters covers all aspects of the artificial neural networks field. Publication delay is about 3 months.

Title: Neural Network News
Publish: AIWeek Inc.
Address: Neural Network News, 2555 Cumberland Parkway, Suite 299, Atlanta, GA 30339, USA. Tel: (404) 434-2187
Freq.: Monthly (beginning September 1989)
Cost/Yr: USA and Canada $249, Elsewhere $299
Remark: Commercial newsletter

Title: Network: Computation in Neural Systems
Publish: IOP Publishing Ltd
Address: Europe: IOP Publishing Ltd, Techno House, Redcliffe Way, Bristol BS1 6NX, UK; USA: American Institute of Physics, Subscriber Services, 500 Sunnyside Blvd., Woodbury, NY 11797-2999
Freq.: Quarterly (1st issue 1990)
Cost/Yr: USA: $180, Europe: 110 pounds
URL: http://www.iop.org/Journals/ne
Remark: Description: "a forum for integrating theoretical and experimental findings across relevant interdisciplinary boundaries." Contents: submitted articles are reviewed by two technical referees for the paper's interdisciplinary format and accessibility. Also Viewpoints and Reviews commissioned by the editors, abstracts (with reviews) of articles published in other journals, and book reviews. Comment: While the price discourages me (my comments are based upon a free sample copy), I think that the journal succeeds very well. The highest density of interesting articles I have found in any journal. (Note: Remarks supplied by kehoe@csufres.CSUFresno.EDU)

Title: Connection Science: Journal of Neural Computing, Artificial Intelligence and Cognitive Research
Publish: Carfax Publishing
Address: Europe: Carfax Publishing Company, PO Box 25, Abingdon, Oxfordshire OX14 3UE, UK; USA: Carfax Publishing Company, PO Box 2025, Dunnellon, Florida 34430-2025, USA; Australia: Carfax Publishing Company, Locked Bag 25, Deakin, ACT 2600, Australia
Freq.: Quarterly (vol. 1 in 1989)
Cost/Yr: Personal rate: 48 pounds (EC), 66 pounds (outside EC), US$118 (USA and Canada); Institutional rate: 176 pounds (EC), 198 pounds (outside EC), US$340 (USA and Canada)

Title: International Journal of Neural Networks
Publish: Learned Information
Freq.: Quarterly (vol. 1 in 1989)
Cost/Yr: 90 pounds
ISSN #: 0954-9889
Remark: The journal contains articles, a conference report (at least in the issue I have), news and a calendar. (Note: remark provided by J.R.M. Smits "anjos@sci.kun.nl")

Title: Sixth Generation Systems (formerly Neurocomputers)
Publish: Gallifrey Publishing
Address: Gallifrey Publishing, PO Box 155, Vicksburg, Michigan 49097, USA. Tel: (616) 649-3772, fax 649-3592
Freq.: Monthly (1st issue January 1987)
ISSN #: 0893-1585
Editor: Derek F. Stubbs
Cost/Yr: $79 (USA, Canada), US$95 (elsewhere)
Remark: Runs eight to 16 pages monthly. In 1995 will go to floppy disc-based publishing with databases plus "the equivalent to 50 pages per issue are planned." Often focuses on specific topics: e.g., August 1994 contains two articles: "Economics, Times Series and the Market," and "Finite Particle Analysis - [part] II." Stubbs also directs the company Advanced Forecasting Technologies. (Remark by Ed Rosenfeld: ier@aol.com)

Title: JNNS Newsletter (Newsletter of the Japan Neural Network Society)
Publish: The Japan Neural Network Society
Freq.: Quarterly (vol. 1 in 1989)
Remark: (IN JAPANESE LANGUAGE) Official Newsletter of the Japan Neural Network Society (JNNS). (Note: remarks by Osamu Saito "saito@nttica.NTT.JP")

Title: Neural Networks Today
Remark: I found this title on a bulletin board in October of last year. It was a message from Tim Pattison, timpatt@augean.OZ (Note: remark provided by J.R.M. Smits "anjos@sci.kun.nl")

Title: Computer Simulations in Brain Science

Title: International Journal of Neuroscience

Title: Neural Network Computation
Remark: Possibly the same as "Neural Computation"

Title: Neural Computing and Applications
Publish: Springer Verlag
Freq.: Quarterly
Cost/Yr: 120 pounds
Remark: The journal of the Neural Computing Applications Forum. Publishes original research and other information in the field of practical applications of neural computing.

Title: Biological Cybernetics (Kybernetik)
Publish: Springer Verlag
Remark: Monthly (vol. 1 in 1961)

Title: Various IEEE Transactions and Magazines
Publish: IEEE
Remark: Primarily see IEEE Trans. on Systems, Man and Cybernetics. Various special issues: April 1990 IEEE Control Systems Magazine; May 1989 IEEE Trans. Circuits and Systems; July 1988 IEEE Trans. Acoust. Speech Signal Process.

Title: The Journal of Experimental and Theoretical Artificial Intelligence
Publish: Taylor & Francis, Ltd.
Address: London, New York, Philadelphia
Freq.: ? (1st issue Jan 1989)
Remark: For submission information, please contact either of the editors: Eric Dietrich, PACSS - Department of Philosophy, SUNY Binghamton, Binghamton, NY 13901, dietrich@bingvaxu.cc.binghamton.edu; or Chris Fields, Box 30001/3CRL, New Mexico State University, Las Cruces, NM 88003-0001, cfields@nmsu.edu

Title: The Behavioral and Brain Sciences
Publish: Cambridge University Press
Remark: (Remarks by Don Wunsch) This is a delightful journal that encourages discussion on a variety of controversial topics. I have especially enjoyed reading some papers in there by Dana Ballard and Stephen Grossberg (separate papers, not collaborations) a few years back. They have a really neat concept: they get a paper, then invite a number of noted scientists in the field to praise it or trash it. They print these commentaries, and give the author(s) a chance to make a rebuttal or concurrence. Sometimes, as I'm sure you can imagine, things get pretty lively. Their reviewers are called something like Behavioral and Brain Associates, and I believe they have to be nominated by current associates, and should be fairly well established in the field. The main thing is that I liked the articles I read.

Title: International Journal of Applied Intelligence
Publish: Kluwer Academic Publishers
Remark: First issue in 1990(?)

Title: International Journal of Modern Physics C
Publish: USA: World Scientific Publishing Co., 1060 Main Street, River Edge, NJ 07666. Tel: (201) 487 9655; Europe: World Scientific Publishing Co. Ltd., 57 Shelton Street, London WC2H 9HE, England. Tel: (0171) 836 0888; Asia: World Scientific Publishing Co. Pte. Ltd., 1022 Hougang Avenue 1 #05-3520, Singapore 1953, Rep. of Singapore. Tel: 382 5663
Freq.: Bi-monthly
Eds: H. Herrmann, R. Brower, G.C. Fox and S. Nose

Title: Machine Learning
Publish: Kluwer Academic Publishers
Address: Kluwer Academic Publishers, P.O. Box 358, Accord Station, Hingham, MA 02018-0358, USA
Freq.: Monthly (8 issues per year; increasing to 12 in 1993)
Cost/Yr: Individual $140 (1992); Member of AAAI or CSCSI $88
Remark: Machine Learning is an international forum for research on computational approaches to learning. The journal publishes articles reporting substantive research results on a wide range of learning methods applied to a variety of task domains. The ideal paper will make a theoretical contribution supported by a computer implementation. The journal has published many key papers in learning theory, reinforcement learning, and decision tree methods. Recently it has published a special issue on connectionist approaches to symbolic reasoning. The journal regularly publishes issues devoted to genetic algorithms as well.

Title: INTELLIGENCE - The Future of Computing
Publish: Intelligence
Address: INTELLIGENCE, P.O. Box 20008, New York, NY 10025-1510, USA. Tel: 212-222-1123 voice & fax; email: ier@aol.com, CIS: 72400,1013
Freq.: Monthly plus four special reports each year (1st issue: May 1984)
ISSN #: 1042-4296
Editor: Edward Rosenfeld
Cost/Yr: $395 (USA), US$450 (elsewhere)
Remark: Has absorbed several other newsletters, like Synapse/Connection and Critical Technology Trends (formerly AI Trends). Covers NN, genetic algorithms, fuzzy systems, wavelets, chaos and other advanced computing approaches, as well as molecular computing and nanotechnology.

Title: Journal of Physics A: Mathematical and General
Publish: Inst. of Physics, Bristol
Freq.: 24 issues per year
Remark: Statistical mechanics aspects of neural networks (mostly Hopfield models).

Title: Physical Review A: Atomic, Molecular and Optical Physics
Publish: The American Physical Society (Am. Inst. of Physics)
Freq.: Monthly
Remark: Statistical mechanics of neural networks.

Title: Information Sciences
Publish: North Holland (Elsevier Science)
Freq.: Monthly
ISSN: 0020-0255
Editor: Paul P. Wang, Department of Electrical Engineering, Duke University, Durham, NC 27706, USA
------------------------------------------------------------------------
The Machine Learning mailing list is an unmoderated mailing list intended for people in Computer Science, Statistics, Mathematics, and other areas or disciplines with interests in Machine Learning. Researchers, practitioners, and users of Machine Learning in academia, industry, and government are encouraged to join the list to discuss and exchange ideas regarding any aspect of Machine Learning, e.g., various learning algorithms, data pre-processing, variable selection mechanisms, instance selection, and applications to real-world problems.
You can post, read, and reply to messages on the Web. Alternatively, you can choose to receive messages as individual emails, daily summaries, or a daily full-text digest, or to read them on the Web only.
CONNECTIONISTS is a moderated mailing list for discussion of technical issues relating to neural computation, and dissemination of professional announcements such as calls for papers, book announcements, and electronic preprints. CONNECTIONISTS is focused on meeting the needs of active researchers in the field, not on answering questions from beginners.
URL: ftp://www.centralneuralsystem.com/pub/CNS/bbs
Supported by: Wesley R. Elsberry, 3027 Macaulay Street, San Diego, CA 92106. Email: welsberr@inia.cls.org
Alternative URL: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/ai-repository/ai/areas/neural/cns/0.html
Many MS-DOS PD and shareware simulations, source code, benchmarks, demonstration packages, and information files; some Unix, Macintosh, and Amiga related files. Also available are files on AI, AI Expert listings 1986-1991, fuzzy logic, genetic algorithms, artificial life, evolutionary biology, and many Project Gutenberg and Wiretap e-texts.
Network Cybernetics Corporation; 4201 Wingren Road, Suite 202; Irving, TX 75062-2763; Tel 214/650-2002; Fax 214/650-1929.
The cost is $129 per disc plus shipping ($5/disc domestic or $10/disc foreign). (See the comp.ai FAQ for further details.)
------------------------------------------------------------------------
Benchmark studies require some familiarity with the statistical design and analysis of experiments. There are many textbooks on this subject, of which Cohen (1995) will probably be of particular interest to researchers in neural nets and machine learning (see also the review of Cohen's book by Ron Kohavi in the International Journal of Neural Systems, which can be found on-line at http://robotics.stanford.edu/users/ronnyk/ronnyk-bib.html).
Reference:
Cohen, P.R. (1995), Empirical Methods for Artificial Intelligence, Cambridge, MA: The MIT Press.
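As one small, hedged example of the kind of analysis such texts treat, the sketch below compares two learning methods on the same cross-validation folds with a paired t-test (the fold errors are made up, and SciPy is my choice of tool, not anything prescribed by Cohen). Pairing by fold removes fold-to-fold variation from the comparison, although the overlap among training sets means the independence assumptions behind the p-value hold only approximately:

    # Minimal sketch with made-up numbers: paired comparison of two methods
    # evaluated on the SAME ten cross-validation folds.
    import numpy as np
    from scipy import stats

    err_a = np.array([0.21, 0.18, 0.25, 0.22, 0.19, 0.24, 0.20, 0.23, 0.18, 0.22])
    err_b = np.array([0.24, 0.20, 0.26, 0.25, 0.21, 0.27, 0.22, 0.24, 0.21, 0.25])

    diff = err_a - err_b
    t, p = stats.ttest_rel(err_a, err_b)   # paired t-test on fold-wise differences
    print(f"mean difference = {diff.mean():+.3f}, t = {t:.2f}, p = {p:.4f}")
    # Treat such p-values as rough guides; resampled training sets are not
    # independent, so the nominal error rates are not exact.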
------------------------------------------------------------------------
The system requirements for all databases are a 5.25" CD-ROM drive with software to read ISO-9660 format. Contact: Darrin L. Dimmick; dld@magi.ncsl.nist.gov; (301)975-4147
The prices of the databases are between US$250 and US$1895. If you wish to order a database, please contact: Standard Reference Data; National Institute of Standards and Technology; 221/A323; Gaithersburg, MD 20899; Phone: (301)975-2208; FAX: (301)926-0416
Samples of the data can be found by ftp on sequoyah.ncsl.nist.gov in directory /pub/data. A more complete description of the available databases can be obtained from the same host as /pub/databases/catalog.txt
Specifications of the database include:
+ 300 ppi 8-bit grayscale handwritten words (cities, states, ZIP Codes)
  o 5632 city words
  o 4938 state words
  o 9454 ZIP Codes
+ 300 ppi binary handwritten characters and digits:
  o 27,837 mixed alphas and numerics segmented from address blocks
  o 21,179 digits segmented from ZIP Codes
+ every image supplied with a manually determined truth value
+ extracted from live mail in a working U.S. Post Office
+ word images in the test set supplied with dictionaries of postal words that simulate partial recognition of the corresponding ZIP Code
+ digit images included in the test set that simulate automatic ZIP Code segmentation; results on these data can be projected to overall ZIP Code recognition performance
+ image format documentation and software included

System requirements are a 5.25" CD-ROM drive with software to read ISO-9660 format. For further information, see http://www.cedar.buffalo.edu/Databases/CDROM1/ or send email to Ajay Shekhawat at <ajay@cedar.Buffalo.EDU>
There is also a CEDAR CDROM-2, a database of machine-printed Japanese character images.
Some of the datasets were used in a prediction contest and are described in detail in the book "Time Series Prediction: Forecasting the Future and Understanding the Past," edited by Weigend and Gershenfeld, Proceedings Volume XV in the Santa Fe Institute Studies in the Sciences of Complexity series, Addison-Wesley (1994).
The numbers of series of various types are given in the following table:
Interval    Micro  Industry  Macro  Finance  Demog  Other  Total
Yearly        146       102     83       58    245     11    645
Quarterly     204        83    336       76     57      0    756
Monthly       474       334    312      145    111     52   1428
Other           4         0      0       29      0    141    174
Total         828       519    731      308    413    204   3003
http://www.chdwk.com/data/index.html
For further information, see http://facesaver.usenix.org/
According to the archive administrator, Barbara L. Dijker
(barb.dijker@labyrinth.com), there are no restrictions on their use.
However, the image files are stored in separate directories
corresponding to the Internet site to which the person represented in
the image belongs, with each directory containing a small number of
images (two on average). This makes it difficult to retrieve by ftp
even a small part of the database, as you have to get each one
individually.
A solution, as Barbara proposed to me, would be to compress the whole set
of images (in separate files of, say, 100 images) and maintain them as a
specific archive for research on face processing, similar to the ones
that already exist for fingerprints and others. The whole compressed
database would take some 30 megabytes of disk space. I encourage anyone
willing to host this database in his/her site, available for anonymous
ftp, to contact her for details (unfortunately I don't have the
resources to set up such a site).
Please consider that UUNET has graciously provided the ftp server for the FaceSaver archive and may discontinue that service if it becomes a burden. This means that people should not download more than maybe 10 faces at a time from uunet.
A last remark: each file represents a different person (except for isolated cases). This makes the database quite unsuitable for training neural networks, since proper generalisation requires several instances of the same subject. However, it is still useful as a test set for an already trained network.
Linguistic Data Consortium University of Pennsylvania 3615 Market Street, Suite 200 Philadelphia, PA 19104-2608 Tel (215) 898-0464 Fax (215) 573-2175 Email: ldc@ldc.upenn.edu
CityU Image Processing Lab: http://www.image.cityu.edu.hk/images/database.html
Center for Image Processing Research: http://cipr.rpi.edu/
Computer Vision Test Images: http://www.cs.cmu.edu:80/afs/cs/project/cil/ftp/html/v-images.html
Lenna 97: A Complete Story of Lenna: http://www.image.cityu.edu.hk/images/lenna/Lenna97.html
------------------------------------------------------------------------
Next part is part 5 (of 7). Previous part is part 3.