Simplify Big Data Analytics with Amazon EMR
內容簡介
Design scalable big data solutions using Hadoop, Spark, and AWS cloud native services
Key Features: Build data pipelines that require distributed processing capabilities on a large volume of dataDiscover the security features of EMR such as data protection and granular permission managementExplore best practices and optimization techniques for building data analytics solutions in Amazon EMR
Book Description:
Amazon EMR, formerly Amazon Elastic MapReduce, provides a managed Hadoop cluster in Amazon Web Services (AWS) that you can use to implement batch or streaming data pipelines. By gaining expertise in Amazon EMR, you can design and implement data analytics pipelines with persistent or transient EMR clusters in AWS.
This book is a practical guide to Amazon EMR for building data pipelines. You'll start by understanding the Amazon EMR architecture, cluster nodes, features, and deployment options, along with their pricing. Next, the book covers the various big data applications that EMR supports. You'll then focus on the advanced configuration of EMR applications, hardware, networking, security, troubleshooting, logging, and the different SDKs and APIs it provides. Later chapters will show you how to implement common Amazon EMR use cases, including batch ETL with Spark, real-time streaming with Spark Streaming, and handling UPSERT in S3 Data Lake with Apache Hudi. Finally, you'll orchestrate your EMR jobs and strategize on-premises Hadoop cluster migration to EMR. In addition to this, you'll explore best practices and cost optimization techniques while implementing your data analytics pipeline in EMR.
By the end of this book, you'll be able to build and deploy Hadoop- or Spark-based apps on Amazon EMR and also migrate your existing on-premises Hadoop workloads to AWS.
What You Will Learn: Explore Amazon EMR features, architecture, Hadoop interfaces, and EMR StudioConfigure, deploy, and orchestrate Hadoop or Spark jobs in productionImplement the security, data governance, and monitoring capabilities of EMRBuild applications for batch and real-time streaming data analytics solutionsPerform interactive development with a persistent EMR cluster and NotebookOrchestrate an EMR Spark job using AWS Step Functions and Apache Airflow
Who this book is for:
This book is for data engineers, data analysts, data scientists, and solution architects who are interested in building data analytics solutions with the Hadoop ecosystem services and Amazon EMR. Prior experience in either Python programming, Scala, or the Java programming language and a basic understanding of Hadoop and AWS will help you make the most out of this book.
配送方式
-
台灣
- 國內宅配:本島、離島
-
到店取貨:
不限金額免運費
-
海外
- 國際快遞:全球
-
港澳店取:
訂購/退換貨須知
加入金石堂 LINE 官方帳號『完成綁定』,隨時掌握出貨動態:
商品運送說明:
- 本公司所提供的產品配送區域範圍目前僅限台灣本島。注意!收件地址請勿為郵政信箱。
- 商品將由廠商透過貨運或是郵局寄送。消費者訂購之商品若無法送達,經電話或 E-mail無法聯繫逾三天者,本公司將取消該筆訂單,並且全額退款。
- 當廠商出貨後,您會收到E-mail出貨通知,您也可透過【訂單查詢】確認出貨情況。
- 產品顏色可能會因網頁呈現與拍攝關係產生色差,圖片僅供參考,商品依實際供貨樣式為準。
- 如果是大型商品(如:傢俱、床墊、家電、運動器材等)及需安裝商品,請依商品頁面說明為主。訂單完成收款確認後,出貨廠商將會和您聯繫確認相關配送等細節。
- 偏遠地區、樓層費及其它加價費用,皆由廠商於約定配送時一併告知,廠商將保留出貨與否的權利。
提醒您!!
金石堂及銀行均不會請您操作ATM! 如接獲電話要求您前往ATM提款機,請不要聽從指示,以免受騙上當!
退換貨須知:
**提醒您,鑑賞期不等於試用期,退回商品須為全新狀態**
-
依據「消費者保護法」第19條及行政院消費者保護處公告之「通訊交易解除權合理例外情事適用準則」,以下商品購買後,除商品本身有瑕疵外,將不提供7天的猶豫期:
- 易於腐敗、保存期限較短或解約時即將逾期。(如:生鮮食品)
- 依消費者要求所為之客製化給付。(客製化商品)
- 報紙、期刊或雜誌。(含MOOK、外文雜誌)
- 經消費者拆封之影音商品或電腦軟體。
- 非以有形媒介提供之數位內容或一經提供即為完成之線上服務,經消費者事先同意始提供。(如:電子書、電子雜誌、下載版軟體、虛擬商品…等)
- 已拆封之個人衛生用品。(如:內衣褲、刮鬍刀、除毛刀…等)
- 若非上列種類商品,均享有到貨7天的猶豫期(含例假日)。
- 辦理退換貨時,商品(組合商品恕無法接受單獨退貨)必須是您收到商品時的原始狀態(包含商品本體、配件、贈品、保證書、所有附隨資料文件及原廠內外包裝…等),請勿直接使用原廠包裝寄送,或於原廠包裝上黏貼紙張或書寫文字。
- 退回商品若無法回復原狀,將請您負擔回復原狀所需費用,嚴重時將影響您的退貨權益。




商品評價