000 03956cam a22003617a 4500
008 130227s2012 caua b 001 0 eng
010 _a 2012277057
016 7 _a015952116
_2Uk
020 _a9789350236741
020 _a1449303714
035 _a(OCoLC)ocn783384312
042 _alccopycat
082 0 4 _a005.1 MAC-C
_223
100 1 _aConway, Drew.
245 1 0 _aMachine learning for hackers /
_cDrew Conway and John Myles White.
250 _a1st ed.
260 _aSebastopol, CA :
_bO'Reilly Media,
_c2012.
500 _a"Case studies and algorithms to get you started"--Cover.
504 _aIncludes bibliographical references (p. 293-294) and index.
505 0 _aMachine generated contents note: 1. Using R -- R for Machine Learning -- Downloading and Installing R -- IDEs and Text Editors -- Loading and Installing R Packages -- R Basics for Machine Learning -- Further Reading on R -- 2. Data Exploration -- Exploration versus Confirmation -- What Is Data? -- Inferring the Types of Columns in Your Data -- Inferring Meaning -- Numeric Summaries -- Means, Medians, and Modes -- Quantiles -- Standard Deviations and Variances -- Exploratory Data Visualization -- Visualizing the Relationships Between Columns -- 3. Classification: Spam Filtering -- This or That: Binary Classification -- Moving Gently into Conditional Probability -- Writing Our First Bayesian Spam Classifier -- Defining the Classifier and Testing It with Hard Ham -- Testing the Classifier Against All Email Types -- Improving the Results -- 4. Ranking: Priority Inbox -- How Do You Sort Something When You Don't Know the Order? -- Ordering Email Messages by Priority.
505 0 _aContents note continued: Priority Features of Email -- Writing a Priority Inbox -- Functions for Extracting the Feature Set -- Creating a Weighting Scheme for Ranking -- Weighting from Email Thread Activity -- Training and Testing the Ranker -- 5. Regression: Predicting Page Views -- Introducing Regression -- The Baseline Model -- Regression Using Dummy Variables -- Linear Regression in a Nutshell -- Predicting Web Traffic -- Defining Correlation -- 6. Regularization: Text Regression -- Nonlinear Relationships Between Columns: Beyond Straight Lines -- Introducing Polynomial Regression -- Methods for Preventing Overfitting -- Preventing Overfitting with Regularization -- Text Regression -- Logistic Regression to the Rescue -- 7. Optimization: Breaking Codes -- Introduction to Optimization -- Ridge Regression -- Code Breaking as Optimization -- 8. PCA: Building a Market Index -- Unsupervised Learning -- 9. MDS: Visually Exploring US Senator Similarity.
505 0 _aContents note continued: Clustering Based on Similarity -- A Brief Introduction to Distance Metrics and Multidimensional Scaling -- How Do US Senators Cluster? -- Analyzing US Senator Roll Call Data (101st-111th Congresses) -- 10. kNN: Recommendation Systems -- The k-Nearest Neighbors Algorithm -- R Package Installation Data -- 11. Analyzing Social Graphs -- Social Network Analysis -- Thinking Graphically -- Hacking Twitter Social Graph Data -- Working with the Google SocialGraph API -- Analyzing Twitter Networks -- Local Community Structure -- Visualizing the Clustered Twitter Network with Gephi -- Building Your Own "Who to Follow" Engine -- 12. Model Comparison -- SVMs: The Support Vector Machine -- Comparing Algorithms.
650 0 _aComputer algorithms.
650 0 _aElectronic data processing
_xAutomation.
700 1 _aWhite, John Myles.
856 4 2 _uhttp://www.loc.gov/catdir/enhancements/fy1307/2012277057-b.html
856 4 2 _uhttp://www.loc.gov/catdir/enhancements/fy1307/2012277057-d.html
856 4 1 _uhttp://www.loc.gov/catdir/enhancements/fy1307/2012277057-t.html
942 _2ddc
_cTR
906 _a7
_bcbc
_ccopycat
_d2
_encip
_f20
_gy-gencatlg
999 _c22684
_d22684