
An Optimal Reduce Placement Algorithm for Data Skew Based on Sampling

Publication type:
Conference paper
Authors:
An Optimal Reduce Placement Algorithm for Data Skew Based on Sampling
Author affiliation:
Hunan University
Language:
English
Keywords:
Data sampling; Data skew; Inner communication; MapReduce; Reduce placement
Journal:
Workshop on Big Data Benchmarks
Year:
2015
Conference name:
Big Data Benchmarks, Performance Optimization and Emerging Hardware. 6th Workshop, BPOE 2015
Conference dates:
2015-08-31 to 2015-09-04
Conference location:
Kohala, HI, USA
Abstract:
Because of frequent disk I/O and large data transfers among different racks and physical nodes, intermediate data communication has become the biggest performance bottleneck in most running Hadoop systems. This paper proposes a reduce placement algorithm called CORP that schedules related map and reduce tasks on nearby nodes, clusters, or racks to improve data locality. Since the number of keys cannot be counted until the input data are processed by map tasks, the paper first provides a sampling algorithm based on reservoir sampling to estimate the distribution of the keys in the intermediate data....
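
The abstract does not give the details of the paper's sampling step, so the following is only a minimal sketch of classic reservoir sampling (Algorithm R) applied to a stream of keys, not the CORP algorithm itself. The function names `reservoir_sample` and `estimate_key_distribution`, the sample size `k`, and the fixed seed are all assumptions made for illustration.

```python
import random
from collections import Counter


def reservoir_sample(stream, k, rng):
    """Pick up to k items uniformly at random from an iterable of unknown length.

    Classic Algorithm R: keep the first k items, then replace a random slot
    with item i (0-based) with probability k / (i + 1).
    """
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def estimate_key_distribution(keys, k, seed=0):
    """Estimate the relative frequency of each key from a size-k sample.

    Hypothetical helper: the seed is fixed only to make the sketch repeatable.
    """
    sample = reservoir_sample(keys, k, random.Random(seed))
    if not sample:
        return {}
    counts = Counter(sample)
    return {key: count / len(sample) for key, count in counts.items()}


if __name__ == "__main__":
    # A deliberately skewed key stream: "a" is nine times as frequent as "b".
    skewed_keys = ["a"] * 9000 + ["b"] * 1000
    print(estimate_key_distribution(skewed_keys, k=500))
```

The sample is built in a single pass with memory bounded by `k`, which is why this family of techniques suits a setting where the total number of keys is not known in advance. How the estimated distribution then feeds into reduce placement is not described in the truncated abstract and is left out here.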
