Accelerating Apache Spark™ 3

Powering next-generation data analytics and AI with NVIDIA GPUs

GPU-Accelerated Libraries, DataFrames, and APIs

Layered on top of NVIDIA CUDA, RAPIDS is a suite of open-source software libraries and APIs that bring GPU parallelism and high-bandwidth memory speed to DataFrame and graph operations, achieving speedups of 50x or more on typical end-to-end data science workflows. For Spark 3.0, Spark SQL and DataFrames use new RAPIDS APIs for GPU-accelerated, memory-efficient columnar data processing and query plans.
With Spark 3.0, the Catalyst query optimizer has been modified to identify operators within a query plan that can be accelerated with the RAPIDS API and to schedule those operators on GPUs within the Spark cluster when the query plan is executed.
A new Spark shuffle implementation, built on GPU-accelerated communication libraries including remote direct memory access (RDMA), dramatically reduces data transfer among Spark processes. RDMA allows GPUs to communicate directly with each other across nodes at up to 100 Gb/s, operating as if they were on one massive server.
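
As a rough illustration of how the GPU-accelerated SQL path described above is typically switched on, the sketch below collects the Spark properties used by the RAPIDS Accelerator for Apache Spark and renders them as `spark-submit` arguments. The helper names `rapids_sql_conf` and `to_submit_args` are hypothetical and exist only for this example. It also assumes the RAPIDS Accelerator jar is already on the Spark classpath, and the exact set of properties you need may differ by plugin version.

```python
from typing import Dict, List


def rapids_sql_conf(enabled: bool = True) -> Dict[str, str]:
    """Build Spark properties that load the RAPIDS Accelerator SQL plugin.

    Hypothetical helper for illustration; assumes the RAPIDS Accelerator
    jar is already available to the driver and executors.
    """
    return {
        # Plugin class that hooks GPU operators into Spark SQL planning.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Master switch for GPU acceleration of SQL and DataFrame operators.
        "spark.rapids.sql.enabled": str(enabled).lower(),
        # Ask the plugin to log operators that could not be placed on the GPU.
        "spark.rapids.sql.explain": "NOT_ON_GPU",
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render a property map as a flat list of `--conf key=value` arguments."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments so they can be pasted into a spark-submit command.
    print(" ".join(to_submit_args(rapids_sql_conf())))
```

In a PySpark application, the same key/value pairs could instead be applied one by one with `SparkSession.builder.config(key, value)` before the session is created.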

GPU-aware Scheduling in Spark

Spark 3.0 adds integration with the cluster managers (YARN, Kubernetes, and Standalone) to request GPUs, along with plugin points that allow Spark to be extended to run operations on the GPU. This makes GPUs easier for Spark application developers to request and use, allows closer integration with deep learning and AI frameworks such as Horovod and TensorFlow on Spark, and improves GPU utilization.
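
To make GPU requests concrete, here is a minimal sketch of the resource-scheduling properties that Spark 3.0 introduced for this purpose. The helper `gpu_scheduling_conf` and the discovery script path are assumptions made for this example. The property names themselves are standard Spark settings, but the right values depend on your cluster manager and hardware.

```python
from typing import Dict


def gpu_scheduling_conf(
    gpus_per_executor: int,
    tasks_per_gpu: int,
    discovery_script: str,
) -> Dict[str, str]:
    """Build Spark 3 properties that request GPUs for executors and tasks.

    Hypothetical helper for illustration only.
    """
    if gpus_per_executor < 1 or tasks_per_gpu < 1:
        raise ValueError("gpus_per_executor and tasks_per_gpu must be >= 1")

    # A fractional per-task amount lets several tasks share one GPU,
    # for example 0.25 means four concurrent tasks per GPU.
    task_amount = 1.0 / tasks_per_gpu

    return {
        # Number of GPUs each executor asks the cluster manager for.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        # Share of a GPU that each task needs before it can be scheduled.
        "spark.task.resource.gpu.amount": f"{task_amount:g}",
        # Script the executor runs to report the GPU addresses it can see.
        "spark.executor.resource.gpu.discoveryScript": discovery_script,
    }


if __name__ == "__main__":
    # The script path below is an assumed location, not a fixed default.
    conf = gpu_scheduling_conf(1, 4, "/opt/spark/getGpusResources.sh")
    for key, value in conf.items():
        print(f"{key}={value}")
```

Inside a running PySpark task, the GPU addresses assigned by the scheduler can then be read with `TaskContext.get().resources()["gpu"].addresses` and handed to a deep learning framework.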

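The powerful execution engine in Apache Spark™ processes data in parallel at massive scale across clusters of machines, enabling rapid application development and high performance. Spark 3 brings major improvements that let you use the massively parallel architecture of GPUs to further accelerate Spark data processing.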

Read this ebook to learn how the innovations in Spark 3 use massively parallel GPU architecture to further accelerate Spark data processing.

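Fill out the form to download the ebook and learn about the following:

The evolution of data processing, from Hadoop to GPUs and NVIDIA RAPIDS™
What Spark is, what it does, and why it matters
GPU acceleration in Spark
DataFrames and Spark SQL
A Spark regression example with a Random Forest classifier
An example of an end-to-end machine learning workflow accelerated on GPUs with XGBoost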
