About

A range request lets an HTTP client ask for a portion of a file instead of the entire file. This makes it possible to read a file in multiple portions, and therefore in parallel.

Example

Example with a GET method

GET https://datacadamia.com/path/to/file HTTP/1.1
Range: bytes=0-10000
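
The same request can be sketched in Python with the standard library. This is a minimal illustration, not a reference client: the helper names `range_header` and `fetch_range` are hypothetical, and the URL passed in is up to the caller. Note that the range unit is `bytes` and that both offsets are inclusive, so `bytes=0-10000` asks for 10001 bytes.

```python
import urllib.request


def range_header(first_byte: int, last_byte: int) -> str:
    """Build a Range header value; both offsets are inclusive."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("expected 0 <= first_byte <= last_byte")
    return f"bytes={first_byte}-{last_byte}"


def fetch_range(url: str, first_byte: int, last_byte: int) -> bytes:
    """Download one portion of a file (hypothetical helper)."""
    request = urllib.request.Request(
        url, headers={"Range": range_header(first_byte, last_byte)}
    )
    with urllib.request.urlopen(request) as response:
        # 206 Partial Content means the server honored the range.
        # 200 means it ignored the header and sent the whole file.
        if response.status != 206:
            raise RuntimeError(f"range not honored, status {response.status}")
        return response.read()
```

A server that does not support ranges may answer with 200 and the full body, which is why the sketch checks for status 206 before trusting the result.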

Usage

EMR uses this technique to read data from S3. For example, if a single data file on Amazon S3 is about 1 GB and the Amazon S3 split size is 64 MB, Hadoop reads the file by issuing about 16 HTTP requests in parallel (1 GB / 64 MB ≈ 16).
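
The split arithmetic and the parallel read can be sketched as follows. This is only an illustration of the idea, not how EMR or Hadoop implements it: `split_ranges` and `read_in_parallel` are hypothetical names, and `fetch` stands for any caller-supplied function that returns the bytes of one inclusive range (for instance, one that sends an HTTP request with a `Range` header).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(file_size: int, split_size: int) -> List[Tuple[int, int]]:
    """Cut [0, file_size) into inclusive (first, last) byte ranges."""
    if file_size < 0 or split_size <= 0:
        raise ValueError("file_size must be >= 0 and split_size > 0")
    return [
        (start, min(start + split_size, file_size) - 1)
        for start in range(0, file_size, split_size)
    ]


def read_in_parallel(
    file_size: int,
    split_size: int,
    fetch: Callable[[int, int], bytes],
    max_workers: int = 8,
) -> bytes:
    """Fetch every range concurrently and reassemble the file."""
    ranges = split_ranges(file_size, split_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so the parts
        # can be concatenated without any extra sorting.
        parts = pool.map(lambda r: fetch(r[0], r[1]), ranges)
        return b"".join(parts)
```

With a file of 1024³ bytes and a split size of 64 × 1024² bytes, `split_ranges` returns 16 ranges; when the file size is not an exact multiple of the split size, the last range is simply shorter.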
