Machine Learning with Spark (Spark机器学习), by Nick Pentreath

Book details:

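ISBN: 9787564160913
Author: Nick Pentreath
Publisher: not listed
Publication date: 2016-01
Pages: 319
Price: 43.50
Paper: lightweight paper
Binding: paperback, perfect bound
Format: 16mo (16开)
Language: not listed
Series: not listed
Tags: none
Douban rating: none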

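Summary:

From this book you will learn to create your first Spark program in Scala, Java, and Python; set up and configure a Spark development environment on your own computer and on Amazon EC2; access public machine learning datasets and use Spark to load, process, clean, and transform data; use Spark's machine learning library to implement programs that apply a variety of well-known machine learning models; and more.

Table of contents: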

Preface

Chapter 1: Getting Up and Running with Spark

Installing and setting up Spark locally

Spark clusters

The Spark programming model

SparkContext and SparkConf

The Spark shell

Resilient Distributed Datasets

Creating RDDs

Spark operations

Caching RDDs

Broadcast variables and accumulators

The first step to a Spark program in Scala

The first step to a Spark program in Java

The first step to a Spark program in Python

Getting Spark running on Amazon EC2

Launching an EC2 Spark cluster

Summary

Chapter 2: Designing a Machine Learning System

Introducing MovieStream

Business use cases for a machine learning system

Personalization

Targeted marketing and customer segmentation

Predictive modeling and analytics

Types of machine learning models

The components of a data-driven machine learning system

Data ingestion and storage

Data cleansing and transformation

Model training and testing loop

Model deployment and integration

Model monitoring and feedback

Batch versus real time

An architecture for a machine learning system

Practical exercise

Summary

Chapter 3: Obtaining, Processing, and Preparing Data with Spark

Accessing publicly available datasets

The MovieLens 100k dataset

Exploring and visualizing your data

Exploring the user dataset

Exploring the movie dataset

Exploring the rating dataset

Processing and transforming your data

Filling in bad or missing data

Extracting useful features from your data

Numerical features

Categorical features

Derived features

Transforming timestamps into categorical features

Text features

Simple text feature extraction

Normalizing features

Using MLlib for feature normalization

Using packages for feature extraction

Summary

Chapter 4: Building a Recommendation Engine with Spark

Types of recommendation models

Content-based filtering

Collaborative filtering

Matrix factorization

Extracting the right features from your data

Extracting features from the MovieLens 100k dataset

Training the recommendation model

Training a model on the MovieLens 100k dataset

Training a model using implicit feedback data

Using the recommendation model

User recommendations

Generating movie recommendations from the MovieLens 100k dataset

Item recommendations

Generating similar movies for the MovieLens 100k dataset

Evaluating the performance of recommendation models

Mean Squared Error

Mean average precision at K

Using MLlib's built-in evaluation functions

RMSE and MSE

MAP

Summary

Chapter 5: Building a Classification Model with Spark

Types of classification models

Linear models

Logistic regression

Linear support vector machines

The naive Bayes model

Decision trees

Extracting the right features from your data

Extracting features from the Kaggle/StumbleUpon evergreen classification dataset

Training classification models

Training a classification model on the Kaggle/StumbleUpon evergreen classification dataset

Using classification models

Generating predictions for the Kaggle/StumbleUpon evergreen classification dataset

Evaluating the performance of classification models

Accuracy and prediction error

Precision and recall

ROC curve and AUC

Improving model performance and tuning parameters

Feature standardization

Additional features

Using the correct form of data

Tuning model parameters

Linear models

Decision trees

The naive Bayes model

Cross-validation

Summary

Chapter 6: Building a Regression Model with Spark

Types of regression models

Least squares regression

Decision trees for regression

Extracting the right features from your data

Extracting features from the bike sharing dataset

Creating feature vectors for the linear model

Creating feature vectors for the decision tree

Training and using regression models

Training a regression model on the bike sharing dataset

Evaluating the performance of regression models

Mean Squared Error and Root Mean Squared Error

Mean Absolute Error

Root Mean Squared Log Error

The R-squared coefficient

Computing performance metrics on the bike sharing dataset

Linear model

Decision tree

Improving model performance and tuning parameters

Transforming the target variable

Impact of training on log-transformed targets

Tuning model parameters

Creating training and testing sets to evaluate parameters

The impact of parameter settings for linear models

The impact of parameter settings for the decision tree

Summary

Chapter 7: Building a Clustering Model with Spark

Types of clustering models

K-means clustering

Initialization methods

Variants

Mixture models

Hierarchical clustering

Extracting the right features from your data

Extracting features from the MovieLens dataset

Extracting movie genre labels

Training the recommendation model

Normalization

Training a clustering model

Training a clustering model on the MovieLens dataset

Making predictions using a clustering model

Interpreting cluster predictions on the MovieLens dataset

Interpreting the movie clusters

Evaluating the performance of clustering models

Internal evaluation metrics

External evaluation metrics

Computing performance metrics on the MovieLens dataset

Tuning parameters for clustering models

Selecting K through cross-validation

Summary

Chapter 8: Dimensionality Reduction with Spark

Types of dimensionality reduction

Principal Components Analysis

Singular Value Decomposition

Relationship with matrix factorization

Clustering as dimensionality reduction

Extracting the right features from your data

Extracting features from the LFW dataset

Exploring the face data

Visualizing the face data

Extracting facial images as vectors

Normalization

Training a dimensionality reduction model

Running PCA on the LFW dataset

Visualizing the Eigenfaces

Interpreting the Eigenfaces

Using a dimensionality reduction model

Projecting data using PCA on the LFW dataset

The relationship between PCA and SVD

Evaluating dimensionality reduction models

Evaluating k for SVD on the LFW dataset

Summary

Chapter 9: Advanced Text Processing with Spark

What's so special about text data?

Extracting the right features from your data

Term weighting schemes

Feature hashing

Extracting the TF-IDF features from the 20 Newsgroups dataset

Exploring the 20 Newsgroups data

Applying basic tokenization

Improving our tokenization

Removing stop words

Excluding terms based on frequency

A note about stemming

Training a TF-IDF model

Analyzing the TF-IDF weightings

Using a TF-IDF model

Document similarity with the 20 Newsgroups dataset and TF-IDF features

Training a text classifier on the 20 Newsgroups dataset using TF-IDF

Evaluating the impact of text processing

Comparing raw features with processed TF-IDF features on the 20 Newsgroups dataset

Word2Vec models

Word2Vec on the 20 Newsgroups dataset

Summary

Chapter 10: Real-time Machine Learning with Spark Streaming

Online learning

Stream processing

An introduction to Spark Streaming

Input sources

Transformations

Actions

Window operators

Caching and fault tolerance with Spark Streaming

Creating a Spark Streaming application

The producer application

Creating a basic streaming application

Streaming analytics

Stateful streaming

Online learning with Spark Streaming

Streaming regression

A simple streaming regression program

Creating a streaming data producer

Creating a streaming regression model

Streaming K-means

Online model evaluation

Comparing model performance with Spark Streaming

Summary

Index

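About the author:

Nick Pentreath. If you are a Scala, Java, or Python developer with an interest in machine learning and data analysis, and you are keen to learn how to use the Spark framework to apply common machine learning techniques to large-scale applications, this book is written for you. A basic understanding of Spark will help, but it is not required.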

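Selected passages:

In information retrieval, precision is commonly used to measure the quality of the results, while recall measures their completeness.

Typically, precision and recall are inversely related: high precision often comes with low recall, and vice versa.

Precision and recall are of limited use when measured on their own, but they are usually combined into an aggregate or averaged metric. Both also depend on the threshold chosen for the model.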
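The trade-off described in these passages can be sketched in a few lines of plain Python. This is only an illustrative sketch, not code from the book: the scores, labels, and function names are made up, the F1 score stands in as one example of a combined metric, and returning a precision of 1.0 when nothing is predicted positive is a convention assumed here.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when every score >= threshold is predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    # Assumed convention: no positive predictions means no false positives.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall, one common combined metric."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical model scores and true labels (1 = relevant, 0 = not relevant).
scores = [0.95, 0.85, 0.70, 0.55, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.8, 0.5):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}, "
          f"F1={f1_score(p, r):.2f}")
```

On this toy data, lowering the threshold from 0.8 to 0.5 raises recall and lowers precision, which is the inverse relationship the passage describes.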

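Modern big-data scenarios involve requirements such as: integration with other components of the system, especially data collection and storage systems, analytics and reporting, and front-end applications; easy scalability and relative independence from other components ...; ... ideally, support for both batch and real-time processing.

Personalization and recommendation are very similar, but recommendation usually refers specifically to explicitly presenting products or content to users, whereas personalization is sometimes more implicit. For example, MovieStream's search could be personalized so that the search results change based on data about the user.

After initial preprocessing, the data needs to be transformed into a representation suitable for machine learning models. For many model types, this representation is a vector or matrix containing numerical data.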
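As a minimal sketch of what such a numerical representation can look like, the plain-Python example below turns a few records into a feature matrix by keeping a numerical field as-is and expanding a categorical field with 1-of-k (one-hot) encoding. The records, field names, and the helper `one_of_k` are hypothetical and chosen only for illustration; they are not taken from the book.

```python
def one_of_k(value, categories):
    """Encode a categorical value as a 1-of-k vector over the sorted categories."""
    index = {category: i for i, category in enumerate(sorted(categories))}
    vector = [0.0] * len(index)
    vector[index[value]] = 1.0
    return vector


# Hypothetical, already-cleaned user records.
records = [
    {"age": 24, "occupation": "technician"},
    {"age": 53, "occupation": "writer"},
    {"age": 23, "occupation": "technician"},
]

# The set of distinct categories fixes the width of the encoded part.
occupations = {record["occupation"] for record in records}

# Each row is [age, one-hot occupation...]; the rows together form the matrix.
matrix = [
    [float(record["age"])] + one_of_k(record["occupation"], occupations)
    for record in records
]

for row in matrix:
    print(row)
```

Sorting the categories before assigning indices keeps the column order stable between runs, so the same category always maps to the same column.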

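Other content:

About the book

Apache Spark is a distributed computing framework built from the ground up and optimized in particular for low-latency tasks and in-memory data storage. It is one of the few frameworks for parallel computing that combines speed, scalability, in-memory processing, and fault tolerance, while also being very easy to program, with a flexible, expressive, and powerful API design.

Machine Learning with Spark (photocopy reprint, English edition) guides you through the basics of the Spark API used to load and process data, and shows how to prepare suitable input data for a variety of machine learning models. Detailed examples and real-world cases help you learn common machine learning models, including recommender systems, classification, regression, clustering, and dimensionality reduction. You will also see advanced topics such as large-scale text processing, along with methods for online machine learning and model evaluation using Spark Streaming.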
