MARC details
000 - LEADER |
fixed length control field |
16938cam a2200709 i 4500 |
001 - CONTROL NUMBER |
control field |
891186025 |
003 - CONTROL NUMBER IDENTIFIER |
control field |
OCoLC |
005 - DATE AND TIME OF LATEST TRANSACTION |
control field |
20250129142507.0 |
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS |
fixed length control field |
m o d |
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION |
fixed length control field |
cr ||||||||||| |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION |
fixed length control field |
140922s2015 enk ob 001 0 eng |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
International Standard Book Number |
1118950801 |
Qualifying information |
(electronic bk.) |
|
International Standard Book Number |
1118950844 |
Qualifying information |
(electronic bk.) |
|
International Standard Book Number |
9781118950807 |
Qualifying information |
(electronic bk.) |
|
International Standard Book Number |
9781118950845 |
Qualifying information |
(electronic bk.) |
|
Canceled/invalid ISBN |
111833258X |
Qualifying information |
(hardback) |
|
Canceled/invalid ISBN |
111895095X |
|
Canceled/invalid ISBN |
9781118332580 |
Qualifying information |
(hardback) |
|
Canceled/invalid ISBN |
9781118950951 |
035 ## - SYSTEM CONTROL NUMBER |
System control number |
(OCoLC)891186025 |
037 ## - SOURCE OF ACQUISITION |
Stock number |
0C3F661A-3397-4AD2-8301-24CBBD5AAE9F |
Source of stock number/acquisition |
OverDrive, Inc. |
Note |
http://www.overdrive.com |
040 ## - CATALOGING SOURCE |
Original cataloging agency |
DLC |
Language of cataloging |
eng |
Description conventions |
rda |
-- |
pn |
Transcribing agency |
DLC |
Modifying agency |
N$T |
-- |
YDXCP |
-- |
E7B |
-- |
OSU |
-- |
DG1 |
-- |
OCLCF |
-- |
COO |
-- |
OCLCQ |
-- |
RRP |
-- |
TEFOD |
-- |
OCLCQ |
042 ## - AUTHENTICATION CODE |
Authentication code |
pcc |
050 00 - LIBRARY OF CONGRESS CALL NUMBER |
Classification number |
QA76.9.D343 |
072 #7 - SUBJECT CATEGORY CODE |
Subject category code |
COM |
Subject category code subdivision |
000000 |
Source |
bisacsh |
100 1# - MAIN ENTRY--PERSONAL NAME |
Personal name |
Cichosz, Paweł, |
Authority record control number or standard number |
http://id.loc.gov/authorities/names/n2014057642 |
Relator term |
author |
245 10 - TITLE STATEMENT |
Title |
Data mining algorithms : |
Remainder of title |
explained using R / |
Statement of responsibility, etc. |
Pawel Cichosz |
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE |
Place of production, publication, distribution, manufacture |
Chichester, West Sussex ; |
-- |
Malden, MA : |
Name of producer, publisher, distributor, manufacturer |
John Wiley & Sons Inc., |
Date of production, publication, distribution, manufacture, or copyright notice |
2015 |
300 ## - PHYSICAL DESCRIPTION |
Extent |
1 online resource (xxxi, 683 pages) |
336 ## - CONTENT TYPE |
Content type term |
text |
Content type code |
txt |
Source |
rdacontent |
337 ## - MEDIA TYPE |
Media type term |
computer |
Media type code |
c |
Source |
rdamedia |
338 ## - CARRIER TYPE |
Carrier type term |
online resource |
Carrier type code |
cr |
Source |
rdacarrier |
504 ## - BIBLIOGRAPHY, ETC. NOTE |
Bibliography, etc. note |
Includes bibliographical references and index. |
505 0# - FORMATTED CONTENTS NOTE |
Formatted contents note |
Machine generated contents note: pt. I Preliminaries -- 1. Tasks -- 1.1. Introduction -- 1.1.1. Knowledge -- 1.1.2. Inference -- 1.2. Inductive learning tasks -- 1.2.1. Domain -- 1.2.2. Instances -- 1.2.3. Attributes -- 1.2.4. Target attribute -- 1.2.5. Input attributes -- 1.2.6. Training set -- 1.2.7. Model -- 1.2.8. Performance -- 1.2.9. Generalization -- 1.2.10. Overfitting -- 1.2.11. Algorithms -- 1.2.12. Inductive learning as search -- 1.3. Classification -- 1.3.1. Concept -- 1.3.2. Training set -- 1.3.3. Model -- 1.3.4. Performance -- 1.3.5. Generalization -- 1.3.6. Overfitting -- 1.3.7. Algorithms -- 1.4. Regression -- 1.4.1. Target function -- 1.4.2. Training set -- 1.4.3. Model -- 1.4.4. Performance -- 1.4.5. Generalization -- 1.4.6. Overfitting -- 1.4.7. Algorithms -- 1.5. Clustering -- 1.5.1. Motivation -- 1.5.2. Training set -- 1.5.3. Model -- 1.5.4. Crisp vs. soft clustering -- 1.5.5. Hierarchical clustering -- 1.5.6. Performance -- 1.5.7. Generalization -- 1.5.8. Algorithms |
|
Formatted contents note |
1.5.9. Descriptive vs. predictive clustering -- 1.6. Practical issues -- 1.6.1. Incomplete data -- 1.6.2. Noisy data -- 1.7. Conclusion -- 1.8. Further readings -- References -- 2. Basic statistics -- 2.1. Introduction -- 2.2. Notational conventions -- 2.3. Basic statistics as modeling -- 2.4. Distribution description -- 2.4.1. Continuous attributes -- 2.4.2. Discrete attributes -- 2.4.3. Confidence intervals -- 2.4.4. m-Estimation -- 2.5. Relationship detection -- 2.5.1. Significance tests -- 2.5.2. Continuous attributes -- 2.5.3. Discrete attributes -- 2.5.4. Mixed attributes -- 2.5.5. Relationship detection caveats -- 2.6. Visualization -- 2.6.1. Boxplot -- 2.6.2. Histogram -- 2.6.3. Barplot -- 2.7. Conclusion -- 2.8. Further readings -- References -- pt. II Classification -- 3. Decision trees -- 3.1. Introduction -- 3.2. Decision tree model -- 3.2.1. Nodes and branches -- 3.2.2. Leaves -- 3.2.3. Split types -- 3.3. Growing -- 3.3.1. Algorithm outline |
|
Formatted contents note |
10.4. Conclusion -- 10.5. Further readings -- References -- pt. IV Clustering -- 11. (Dis)similarity measures -- 11.1. Introduction -- 11.2. Measuring dissimilarity and similarity -- 11.3. Difference-based dissimilarity -- 11.3.1. Euclidean distance -- 11.3.2. Minkowski distance -- 11.3.3. Manhattan distance -- 11.3.4. Canberra distance -- 11.3.5. Chebyshev distance -- 11.3.6. Hamming distance -- 11.3.7. Gower's coefficient -- 11.3.8. Attribute weighting -- 11.3.9. Attribute transformation -- 11.4. Correlation-based similarity -- 11.4.1. Discrete attributes -- 11.4.2. Pearson's correlation similarity -- 11.4.3. Spearman's correlation similarity -- 11.4.4. Cosine similarity -- 11.5. Missing attribute values -- 11.6. Conclusion -- 11.7. Further readings -- References -- 12. k-Centers clustering -- 12.1. Introduction -- 12.1.1. Basic principle -- 12.1.2. (Dis)similarity measures -- 12.2. Algorithm scheme -- 12.2.1. Initialization -- 12.2.2. Stop criteria -- 12.2.3. Cluster formation |
|
Formatted contents note |
12.2.4. Implicit cluster modeling -- 12.2.5. Instantiations -- 12.3. k-Means -- 12.3.1. Center adjustment -- 12.3.2. Minimizing dissimilarity to centers -- 12.4. Beyond means -- 12.4.1. k-Medians -- 12.4.2. k-Medoids -- 12.5. Beyond (fixed) k -- 12.5.1. Multiple runs -- 12.5.2. Adaptive k-centers -- 12.6. Explicit cluster modeling -- 12.7. Conclusion -- 12.8. Further readings -- References -- 13. Hierarchical clustering -- 13.1. Introduction -- 13.1.1. Basic approaches -- 13.1.2. (Dis)similarity measures -- 13.2. Cluster hierarchies -- 13.2.1. Motivation -- 13.2.2. Model representation -- 13.3. Agglomerative clustering -- 13.3.1. Algorithm scheme -- 13.3.2. Cluster linkage -- 13.4. Divisive clustering -- 13.4.1. Algorithm scheme -- 13.4.2. Wrapping a flat clustering algorithm -- 13.4.3. Stop criteria -- 13.5. Hierarchical clustering visualization -- 13.6. Hierarchical clustering prediction -- 13.6.1. Cutting cluster hierarchies -- 13.6.2. Cluster membership assignment |
|
Formatted contents note |
13.7. Conclusion -- 13.8. Further readings -- References -- 14. Clustering model evaluation -- 14.1. Introduction -- 14.1.1. Dataset performance -- 14.1.2. Training performance -- 14.1.3. True performance -- 14.2. Per-cluster quality measures -- 14.2.1. Diameter -- 14.2.2. Separation -- 14.2.3. Isolation -- 14.2.4. Silhouette width -- 14.2.5. Davies-Bouldin Index -- 14.3. Overall quality measures -- 14.3.1. Dunn Index -- 14.3.2. Average Davies-Bouldin Index -- 14.3.3. C Index -- 14.3.4. Average silhouette width -- 14.3.5. Loglikelihood -- 14.4. External quality measures -- 14.4.1. Misclassification error -- 14.4.2. Rand Index -- 14.4.3. General relationship detection measures -- 14.5. Using quality measures -- 14.6. Conclusion -- 14.7. Further readings -- References -- pt. V Getting Better Models -- 15. Model ensembles -- 15.1. Introduction -- 15.2. Model committees -- 15.3. Base models -- 15.3.1. Different training sets -- 15.3.2. Different algorithms |
|
Formatted contents note |
15.3.3. Different parameter setups -- 15.3.4. Algorithm randomization -- 15.3.5. Base model diversity -- 15.4. Model aggregation -- 15.4.1. Voting/Averaging -- 15.4.2. Probability averaging -- 15.4.3. Weighted voting/averaging -- 15.4.4. Using as attributes -- 15.5. Specific ensemble modeling algorithms -- 15.5.1. Bagging -- 15.5.2. Stacking -- 15.5.3. Boosting -- 15.5.4. Random forest -- 15.5.5. Random Naive Bayes -- 15.6. Quality of ensemble predictions -- 15.7. Conclusion -- 15.8. Further readings -- References -- 16. Kernel methods -- 16.1. Introduction -- 16.2. Support vector machines -- 16.2.1. Classification margin -- 16.2.2. Maximum-margin hyperplane -- 16.2.3. Primal form -- 16.2.4. Dual form -- 16.2.5. Soft margin -- 16.3. Support vector regression -- 16.3.1. Regression tube -- 16.3.2. Primal form -- 16.3.3. Dual form -- 16.4. Kernel trick -- 16.5. Kernel functions -- 16.5.1. Linear kernel -- 16.5.2. Polynomial kernel -- 16.5.3. Radial kernel -- 16.5.4. Sigmoid kernel |
|
Formatted contents note |
16.6. Kernel prediction -- 16.7. Kernel-based algorithms -- 16.7.1. Kernel-based SVM -- 16.7.2. Kernel-based SVR -- 16.8. Conclusion -- 16.9. Further readings -- References -- 17. Attribute transformation -- 17.1. Introduction -- 17.2. Attribute transformation task -- 17.2.1. Target task -- 17.2.2. Target attribute -- 17.2.3. Transformed attribute -- 17.2.4. Training set -- 17.2.5. Modeling transformations -- 17.2.6. Nonmodeling transformations -- 17.3. Simple transformations -- 17.3.1. Standardization -- 17.3.2. Normalization -- 17.3.3. Aggregation -- 17.3.4. Imputation -- 17.3.5. Binary encoding -- 17.4. Multiclass encoding -- 17.4.1. Encoding and decoding functions -- 17.4.2. 1-of-k encoding -- 17.4.3. Error-correcting encoding -- 17.4.4. Effects of multiclass encoding -- 17.5. Conclusion -- 17.6. Further readings -- References -- 18. Discretization -- 18.1. Introduction -- 18.2. Discretization task -- 18.2.1. Motivation -- 18.2.2. Task definition |
|
Formatted contents note |
18.2.3. Discretization as modeling -- 18.2.4. Discretization quality -- 18.3. Unsupervised discretization -- 18.3.1. Equal-width intervals -- 18.3.2. Equal-frequency intervals -- 18.3.3. Nonmodeling discretization -- 18.4. Supervised discretization -- 18.4.1. Pure-class discretization -- 18.4.2. Bottom-up discretization -- 18.4.3. Top-down discretization -- 18.5. Effects of discretization -- 18.6. Conclusion -- 18.7. Further readings -- References -- 19. Attribute selection -- 19.1. Introduction -- 19.2. Attribute selection task -- 19.2.1. Motivation -- 19.2.2. Task definition -- 19.2.3. Algorithms -- 19.3. Attribute subset search -- 19.3.1. Search task -- 19.3.2. Initial state -- 19.3.3. Search operators -- 19.3.4. State selection -- 19.3.5. Stop criteria -- 19.4. Attribute selection filters -- 19.4.1. Simple statistical filters -- 19.4.2. Correlation-based filters -- 19.4.3. Consistency-based filters -- 19.4.4. RELIEF -- 19.4.5. Random forest -- 19.4.6. Cutoff criteria |
|
Formatted contents note |
19.4.7. Filter-driven search -- 19.5. Attribute selection wrappers -- 19.5.1. Subset evaluation -- 19.5.2. Wrapper attribute selection -- 19.6. Effects of attribute selection -- 19.7. Conclusion -- 19.8. Further readings -- References -- 20. Case studies -- 20.1. Introduction -- 20.1.1. Datasets -- 20.1.2. Packages -- 20.1.3. Auxiliary functions -- 20.2. Census income -- 20.2.1. Data loading and preprocessing -- 20.2.2. Default model -- 20.2.3. Incorporating misclassification costs -- 20.2.4. Pruning -- 20.2.5. Attribute selection -- 20.2.6. Final models -- 20.3. Communities and crime -- 20.3.1. Data loading -- 20.3.2. Data quality -- 20.3.3. Regression trees -- 20.3.4. Linear models -- 20.3.5. Attribute selection -- 20.3.6. Piecewise-linear models -- 20.4. Cover type -- 20.4.1. Data loading and preprocessing -- 20.4.2. Class imbalance -- 20.4.3. Decision trees -- 20.4.4. Class rebalancing -- 20.4.5. Multiclass encoding -- 20.4.6. Final classification models -- 20.4.7. Clustering |
|
Formatted contents note |
20.5. Conclusion -- 20.6. Further readings -- References -- Closing -- A. Notation -- A.1. Attribute values -- A.2. Data subsets -- A.3. Probabilities -- B. R packages -- B.1. CRAN packages -- B.2. DMR packages -- B.3. Installing packages -- References -- C. Datasets |
|
Formatted contents note |
3.3.2. Class distribution calculation -- 3.3.3. Class label assignment -- 3.3.4. Stop criteria -- 3.3.5. Split selection -- 3.3.6. Split application -- 3.3.7. Complete process -- 3.4. Pruning -- 3.4.1. Pruning operators -- 3.4.2. Pruning criterion -- 3.4.3. Pruning control strategy -- 3.4.4. Conversion to rule sets -- 3.5. Prediction -- 3.5.1. Class label prediction -- 3.5.2. Class probability prediction -- 3.6. Weighted instances -- 3.7. Missing value handling -- 3.7.1. Fractional instances -- 3.7.2. Surrogate splits -- 3.8. Conclusion -- 3.9. Further readings -- References -- 4. Naive Bayes classifier -- 4.1. Introduction -- 4.2. Bayes rule -- 4.3. Classification by Bayesian inference -- 4.3.1. Conditional class probability -- 4.3.2. Prior class probability -- 4.3.3. Independence assumption -- 4.3.4. Conditional attribute value probabilities -- 4.3.5. Model construction -- 4.3.6. Prediction -- 4.4. Practical issues -- 4.4.1. Zero and small probabilities |
|
Formatted contents note |
4.4.2. Linear classification -- 4.4.3. Continuous attributes -- 4.4.4. Missing attribute values -- 4.4.5. Reducing naivety -- 4.5. Conclusion -- 4.6. Further readings -- References -- 5. Linear classification -- 5.1. Introduction -- 5.2. Linear representation -- 5.2.1. Inner representation function -- 5.2.2. Outer representation function -- 5.2.3. Threshold representation -- 5.2.4. Logit representation -- 5.3. Parameter estimation -- 5.3.1. Delta rule -- 5.3.2. Gradient descent -- 5.3.3. Distance to decision boundary -- 5.3.4. Least squares -- 5.4. Discrete attributes -- 5.5. Conclusion -- 5.6. Further readings -- References -- 6. Misclassification costs -- 6.1. Introduction -- 6.2. Cost representation -- 6.2.1. Cost matrix -- 6.2.2. Per-class cost vector -- 6.2.3. Instance-specific costs -- 6.3. Incorporating misclassification costs -- 6.3.1. Instance weighting -- 6.3.2. Instance resampling -- 6.3.3. Minimum-cost rule -- 6.3.4. Instance relabeling |
|
Formatted contents note |
6.4. Effects of cost incorporation -- 6.5. Experimental procedure -- 6.6. Conclusion -- 6.7. Further readings -- References -- 7. Classification model evaluation -- 7.1. Introduction -- 7.1.1. Dataset performance -- 7.1.2. Training performance -- 7.1.3. True performance -- 7.2. Performance measures -- 7.2.1. Misclassification error -- 7.2.2. Weighted misclassification error -- 7.2.3. Mean misclassification cost -- 7.2.4. Confusion matrix -- 7.2.5. ROC analysis -- 7.2.6. Probabilistic performance measures -- 7.3. Evaluation procedures -- 7.3.1. Model evaluation vs. modeling procedure evaluation -- 7.3.2. Evaluation caveats -- 7.3.3. Hold-out -- 7.3.4. Cross-validation -- 7.3.5. Leave-one-out -- 7.3.6. Bootstrapping -- 7.3.7. Choosing the right procedure -- 7.3.8. Evaluation procedures for temporal data -- 7.4. Conclusion -- 7.5. Further readings -- References -- pt. III Regression -- 8. Linear regression -- 8.1. Introduction -- 8.2. Linear representation |
|
Formatted contents note |
8.2.1. Parametric representation -- 8.2.2. Linear representation function -- 8.2.3. Nonlinear representation functions -- 8.3. Parameter estimation -- 8.3.1. Mean square error minimization -- 8.3.2. Delta rule -- 8.3.3. Gradient descent -- 8.3.4. Least squares -- 8.4. Discrete attributes -- 8.5. Advantages of linear models -- 8.6. Beyond linearity -- 8.6.1. Generalized linear representation -- 8.6.2. Enhanced representation -- 8.6.3. Polynomial regression -- 8.6.4. Piecewise-linear regression -- 8.7. Conclusion -- 8.8. Further readings -- References -- 9. Regression trees -- 9.1. Introduction -- 9.2. Regression tree model -- 9.2.1. Nodes and branches -- 9.2.2. Leaves -- 9.2.3. Split types -- 9.2.4. Piecewise-constant regression -- 9.3. Growing -- 9.3.1. Algorithm outline -- 9.3.2. Target function summary statistics -- 9.3.3. Target value assignment -- 9.3.4. Stop criteria -- 9.3.5. Split selection -- 9.3.6. Split application -- 9.3.7. Complete process -- 9.4. Pruning |
|
Formatted contents note |
9.4.1. Pruning operators -- 9.4.2. Pruning criterion -- 9.4.3. Pruning control strategy -- 9.5. Prediction -- 9.6. Weighted instances -- 9.7. Missing value handling -- 9.7.1. Fractional instances -- 9.7.2. Surrogate splits -- 9.8. Piecewise linear regression -- 9.8.1. Growing -- 9.8.2. Pruning -- 9.8.3. Prediction -- 9.9. Conclusion -- 9.10. Further readings -- References -- 10. Regression model evaluation -- 10.1. Introduction -- 10.1.1. Dataset performance -- 10.1.2. Training performance -- 10.1.3. True performance -- 10.2. Performance measures -- 10.2.1. Residuals -- 10.2.2. Mean absolute error -- 10.2.3. Mean square error -- 10.2.4. Root mean square error -- 10.2.5. Relative absolute error -- 10.2.6. Coefficient of determination -- 10.2.7. Correlation -- 10.2.8. Weighted performance measures -- 10.2.9. Loss functions -- 10.3. Evaluation procedures -- 10.3.1. Hold-out -- 10.3.2. Cross-validation -- 10.3.3. Leave-one-out -- 10.3.4. Bootstrapping -- 10.3.5. Choosing the right procedure |
506 ## - RESTRICTIONS ON ACCESS NOTE |
Terms governing access |
Available to OhioLINK libraries |
520 ## - SUMMARY, ETC. |
Summary, etc. |
"This book narrows down the scope of data mining by adopting a heavily modeling-oriented perspective"-- |
Assigning source |
Provided by publisher |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name entry element |
Computer algorithms. |
Authority record control number or standard number |
http://id.loc.gov/authorities/subjects/sh91000149 |
9 (RLIN) |
534 |
|
Topical term or geographic name entry element |
Data mining. |
Authority record control number or standard number |
http://id.loc.gov/authorities/subjects/sh97002073 |
9 (RLIN) |
6146 |
|
Topical term or geographic name entry element |
R (Computer program language) |
Authority record control number or standard number |
http://id.loc.gov/authorities/subjects/sh2002004407 |
9 (RLIN) |
25812 |
655 #4 - INDEX TERM--GENRE/FORM |
Genre/form data or focus term |
Electronic books |
9 (RLIN) |
2032 |
710 2# - ADDED ENTRY--CORPORATE NAME |
Corporate name or jurisdiction name as entry element |
Ohio Library and Information Network. |
Authority record control number or standard number |
http://id.loc.gov/authorities/names/no95058981 |
776 08 - ADDITIONAL PHYSICAL FORM ENTRY |
Relationship information |
Print version: |
Main entry heading |
Cichosz, Pawel. |
Title |
Data mining algorithms. |
Place, publisher, and date of publication |
Chichester, West Sussex, United Kingdom : Wiley, 2015 |
International Standard Book Number |
9781118332580 |
Record control number |
(DLC) 2014036992 |
-- |
(OCoLC)890971737 |
856 40 - ELECTRONIC LOCATION AND ACCESS |
Materials specified |
OhioLINK |
Public note |
Connect to resource |
Uniform Resource Identifier |
https://rave.ohiolink.edu/ebooks/ebc2/9781118950951 |
|
Materials specified |
Wiley Online Library |
Public note |
Connect to resource (off-campus) |
Uniform Resource Identifier |
https://go.ohiolink.edu/goto?url=https://onlinelibrary.wiley.com/doi/book/10.1002/9781118950951 |
|
Materials specified |
Wiley Online Library |
Public note |
Connect to resource |
Uniform Resource Identifier |
https://onlinelibrary.wiley.com/doi/book/10.1002/9781118950951 |
|
Materials specified |
O'Reilly |
Public note |
Connect to resource |
Uniform Resource Identifier |
https://learning.oreilly.com/library/view/~/9781118950807/?ar |