Programming Collective Intelligence

Programming Collective Intelligence: Building Smart Web 2.0 Applications


1. Introduction to Collective Intelligence
     What Is Collective Intelligence?
     What Is Machine Learning?
     Limits of Machine Learning
     Real-Life Examples
     Other Uses for Learning Algorithms

2. Making Recommendations
     Collaborative Filtering
     Collecting Preferences
     Finding Similar Users
     Recommending Items
     Matching Products
     Building a Link Recommender
     Item-Based Filtering
     Using the MovieLens Dataset
     User-Based or Item-Based Filtering?

3. Discovering Groups
     Supervised versus Unsupervised Learning
     Word Vectors
     Hierarchical Clustering
     Drawing the Dendrogram
     Column Clustering
     K-Means Clustering
     Clusters of Preferences
     Viewing Data in Two Dimensions
     Other Things to Cluster

4. Searching and Ranking
     What's in a Search Engine?
     A Simple Crawler
     Building the Index
     Content-Based Ranking
     Using Inbound Links
     Learning from Clicks

5. Optimization
     Group Travel
     Representing Solutions
     The Cost Function
     Random Searching
     Hill Climbing
     Simulated Annealing
     Genetic Algorithms
     Real Flight Searches
     Optimizing for Preferences
     Network Visualization
     Other Possibilities

6. Document Filtering
     Filtering Spam
     Documents and Words
     Training the Classifier
     Calculating Probabilities
     A Naïve Classifier
     The Fisher Method
     Persisting the Trained Classifiers
     Filtering Blog Feeds
     Improving Feature Detection
     Using Akismet
     Alternative Methods

7. Modeling with Decision Trees
     Predicting Signups
     Introducing Decision Trees
     Training the Tree
     Choosing the Best Split
     Recursive Tree Building
     Displaying the Tree
     Classifying New Observations
     Pruning the Tree
     Dealing with Missing Data
     Dealing with Numerical Outcomes
     Modeling Home Prices
     Modeling "Hotness"
     When to Use Decision Trees

8. Building Price Models
     Building a Sample Dataset
     k-Nearest Neighbors
     Weighted Neighbors
     Heterogeneous Variables
     Optimizing the Scale
     Uneven Distributions
     Using Real Data: The eBay API
     When to Use k-Nearest Neighbors

9. Advanced Classification: Kernel Methods and SVMs
     Matchmaker Dataset
     Difficulties with the Data
     Basic Linear Classification
     Categorical Features
     Scaling the Data
     Understanding Kernel Methods
     Support-Vector Machines
     Using LIBSVM
     Matching on Facebook

10. Finding Independent Features
     A Corpus of News
     Previous Approaches
     Non-Negative Matrix Factorization
     Displaying the Results
     Using Stock Market Data

11. Evolving Intelligence
     What Is Genetic Programming?
     Programs As Trees
     Creating the Initial Population
     Testing a Solution
     Mutating Programs
     Building the Environment
     A Simple Game
     Further Possibilities

12. Algorithm Summary
     Bayesian Classifier
     Decision Tree Classifier
     Neural Networks
     Support-Vector Machines
     k-Nearest Neighbors
     Multidimensional Scaling
     Non-Negative Matrix Factorization

A. Third-Party Libraries

B. Mathematical Formulas