AmazonEC2 - Hadoop Wiki
If you run Hadoop on EC2, consider using AmazonS3 for job data (data transfer between S3 and EC2 instances is free). Initial input can be read from S3 when a cluster is launched, and the final output can be written back to S3 before the cluster is decommissioned. Intermediate, temporary data that is needed only between MapReduce passes is more efficiently stored in Hadoop’s DFS. See AmazonS3 for more details.
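The data flow described above (input from S3, intermediate data in DFS, output back to S3) can be sketched as a short command plan. This is only an illustration under stated assumptions: the bucket name, the job jar, the `/job/input` and `/job/output` DFS paths, and the `s3a` URI scheme are hypothetical placeholders, and the arguments passed after the jar depend on how your own job parses its command line. The helper only builds the command lines; it does not run them.

```python
from typing import List


def plan_job_commands(bucket: str, job_jar: str, scheme: str = "s3a") -> List[List[str]]:
    """Build the three command lines for an S3 -> DFS -> S3 job run.

    Assumptions (not from the wiki page): the bucket layout, the DFS
    paths, and the URI scheme. Use whichever S3 filesystem scheme your
    Hadoop installation is configured for.
    """
    s3_input = f"{scheme}://{bucket}/input"
    s3_output = f"{scheme}://{bucket}/output"
    dfs_input = "hdfs:///job/input"    # hypothetical DFS path
    dfs_output = "hdfs:///job/output"  # hypothetical DFS path

    return [
        # 1. After the cluster is launched, copy the initial input from S3 into DFS.
        ["hadoop", "distcp", s3_input, dfs_input],
        # 2. Run the job against DFS, so data between MapReduce passes stays in DFS.
        #    The two path arguments are an assumption about the job's own CLI.
        ["hadoop", "jar", job_jar, dfs_input, dfs_output],
        # 3. Before the cluster is decommissioned, copy the final output back to S3.
        ["hadoop", "distcp", dfs_output, s3_output],
    ]


if __name__ == "__main__":
    for command in plan_job_commands("example-bucket", "my-job.jar"):
        print(" ".join(command))
```

Keeping the plan as plain argument lists makes it easy to review the steps before handing each one to `subprocess.run` on the cluster's master node.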