In this paper, the authors propose an Availability-aware data placement (ADAPT) strategy that improves application performance without extra storage cost. The objective of the data placement algorithm is to find an optimized mapping from data blocks to nodes, such that all nodes finish processing their assigned blocks at the same time. To do this, the authors propose an analytical model that estimates the execution time of MapReduce tasks in non-dedicated distributed computing environments, where nodes can be interrupted and differ in capability. This lets ADAPT mitigate the impact of node volatility and heterogeneity: it dynamically dispatches data blocks onto participating hosts based on their availability.
ADAPT is implemented within the Hadoop MapReduce platform and adds only minor overhead to the existing Hadoop framework.
The authors evaluate the feasibility and payoffs of ADAPT through extensive experiments and simulations. The experimental results show that ADAPT improves application performance by more than 30%.
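
To make the placement idea concrete, here is a minimal sketch of availability-weighted block assignment. It is not the paper's algorithm or its analytical model: the function names (`effective_rate`, `place_blocks`) and the simplifying assumption that a node's throughput is just its availability fraction divided by its per-block processing time are hypothetical and chosen only for illustration. Each node receives a share of blocks proportional to its estimated throughput, so that all nodes would be expected to finish at roughly the same time.

```python
def effective_rate(availability, seconds_per_block):
    """Estimated blocks per second for one node.

    ASSUMPTION (not from the paper): a node that is available a fraction
    `availability` of the time processes blocks at that fraction of its
    dedicated speed.
    """
    return availability / seconds_per_block


def place_blocks(total_blocks, rates):
    """Split `total_blocks` across nodes in proportion to `rates`.

    `rates` maps node name -> estimated blocks per second. Fractional
    shares are rounded with the largest-remainder method so the counts
    always sum to `total_blocks`.
    """
    total_rate = sum(rates.values())
    ideal = {n: total_blocks * r / total_rate for n, r in rates.items()}
    counts = {n: int(share) for n, share in ideal.items()}  # round down

    # Hand the leftover blocks to the nodes with the largest fractional part.
    leftover = total_blocks - sum(counts.values())
    by_remainder = sorted(ideal, key=lambda n: ideal[n] - counts[n], reverse=True)
    for n in by_remainder[:leftover]:
        counts[n] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical cluster: same hardware, different availability.
    rates = {
        "a": effective_rate(1.0, 8.0),  # always available
        "b": effective_rate(0.5, 8.0),  # available half the time
        "c": effective_rate(0.5, 8.0),
    }
    print(place_blocks(100, rates))  # {'a': 50, 'b': 25, 'c': 25}
```

In this toy setup the always-available node receives twice as many blocks as each of the others, which is the intuition behind placing data according to availability rather than uniformly.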