checkpoints
Sequence File I/O
Extract a Double from a String.
Extract a Float from a String.
Extract a Long from a String.
Extract an Int from a String.
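The four extractors above can be pictured as Scala extractor objects whose `unapply` succeeds only when the string parses as the target type. The following is a minimal sketch in plain Scala with no library dependency; the object names `AnInt`, `ALong`, `AFloat` and `ADouble` and the helper `describe` are assumptions made for illustration and are not taken from the listing above.

```scala
import scala.util.Try

object ExtractorSketch {
  // Hypothetical stand-ins: each extractor returns Some(value) when the
  // trimmed string parses as the target type, and None otherwise.
  object AnInt   { def unapply(s: String): Option[Int]    = Try(s.trim.toInt).toOption }
  object ALong   { def unapply(s: String): Option[Long]   = Try(s.trim.toLong).toOption }
  object AFloat  { def unapply(s: String): Option[Float]  = Try(s.trim.toFloat).toOption }
  object ADouble { def unapply(s: String): Option[Double] = Try(s.trim.toDouble).toOption }

  // Try the narrowest type first and fall through to wider ones.
  def describe(s: String): String = s match {
    case AnInt(i)   => s"Int($i)"
    case ALong(l)   => s"Long($l)"
    case ADouble(d) => s"Double($d)"
    case _          => "not a number"
  }
}
```

Because each extractor returns an `Option`, a failed parse simply makes the pattern not match instead of throwing an exception.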
Avro I/O
INPUT CONVERTERS
Create a new DList from the contents of a list of one or more Avro files. The type of the DList must conform to the schema types allowed by Avro, as constrained by the 'AvroSchema' type class. If a directory is specified, the input comprises all the files in that directory.
Create a new DList from the contents of one or more Avro files. The type of the DList must conform to the schema types allowed by Avro, as constrained by the 'AvroSchema' type class. If a directory is specified, the input comprises all the files in that directory.
Create a distributed list from a text file containing a number of fields delimited by a separator. Use an extractor function to pull out the required fields to create the distributed list.
Create a distributed list from a list of one or more files or directories (for a directory, the input comprises all files in that directory). The file(s) contain a number of fields delimited by a separator. Use an extractor function to pull out the required fields to create the distributed list.
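The idea of splitting each line on a separator and using an extractor function to pull out the required fields can be sketched with ordinary Scala collections. This is an illustration, not the library's implementation: `fromDelimitedLines`, the local `AnInt` extractor and the sample data are hypothetical, and the choice to skip rows the extractor function does not match is an assumption of this sketch.

```scala
import scala.util.Try
import java.util.regex.Pattern

object DelimitedSketch {
  // Hypothetical extractor: matches only when the field parses as an Int.
  object AnInt { def unapply(s: String): Option[Int] = Try(s.trim.toInt).toOption }

  // Split each line on a literal separator, then keep only the rows that
  // the extractor function is defined for (an assumption of this sketch).
  def fromDelimitedLines[A](lines: Seq[String], sep: String)(
      extract: PartialFunction[List[String], A]): Seq[A] =
    lines.map(_.split(Pattern.quote(sep), -1).toList).collect(extract)

  val lines: Seq[String] = Seq("1,alice,30", "2,bob,not-a-number", "3,carol,41")

  // Pull out the first and third fields; the malformed second row is skipped.
  val people: Seq[(Int, Int)] =
    fromDelimitedLines(lines, ",") { case AnInt(id) :: _ :: AnInt(age) :: _ => (id, age) }
}
```

Expressing the extractor function as a partial function over the list of fields keeps parsing and filtering in one place: a row either yields a value or is left out.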
Create a DList from a stream of elements which will only be evaluated on the cluster.
Create a new DList from the contents of a list of one or more Sequence Files. Note that the type parameters K and V are the "converted" Scala types for the Writable key-value types that must be contained in the Sequence Files. If a directory is specified, the input comprises all the files in that directory.
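The notion of "converted" Scala types can be illustrated with a small type class that maps a stored wrapper type to a plain Scala type. Everything in this sketch is hypothetical: `IntBox` and `TextBox` are stand-ins for Writable wrappers rather than the real Hadoop classes, and `Convert` and `convertRecords` are names invented for the example.

```scala
object SequenceSketch {
  // Hypothetical stand-ins for Writable wrappers (not the real Hadoop classes).
  final case class IntBox(get: Int)
  final case class TextBox(get: String)

  // Type class: how a stored wrapper type W converts to a plain Scala type S.
  trait Convert[W, S] { def fromWritable(w: W): S }

  implicit val intConvert: Convert[IntBox, Int] =
    new Convert[IntBox, Int] { def fromWritable(w: IntBox): Int = w.get }
  implicit val textConvert: Convert[TextBox, String] =
    new Convert[TextBox, String] { def fromWritable(w: TextBox): String = w.get }

  // Convert stored key-value records into pairs of plain Scala values.
  def convertRecords[KW, VW, K, V](records: Seq[(KW, VW)])(
      implicit kc: Convert[KW, K], vc: Convert[VW, V]): Seq[(K, V)] =
    records.map { case (k, v) => (kc.fromWritable(k), vc.fromWritable(v)) }

  val stored: Seq[(IntBox, TextBox)] =
    Seq((IntBox(1), TextBox("one")), (IntBox(2), TextBox("two")))

  // K = Int and V = String are the "converted" types in this sketch.
  val pairs: Seq[(Int, String)] = convertRecords(stored)
}
```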
SOURCES
Create a DList from one or more files or directories (for a directory, the input comprises all files in that directory).
Create a distributed list from one or more files or directories (for a directory, the input comprises all files in that directory). Each element of the distributed list is a tuple where the first part is the path of the originating file and the second part is a line of text.
Create a distributed list from a list of one or more files or directories (for a directory, the input comprises all files in that directory). Each element of the distributed list is a tuple where the first part is the path of the originating file and the second part is a line of text.
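The shape of the resulting elements, and the way a directory contributes all of its files, can be sketched on a local file system with `java.nio`. This is a single-machine illustration under stated assumptions, not how the library reads distributed input: `expand` and `linesWithPath` are hypothetical helpers, only files directly inside a directory are taken, and files are visited in path order.

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

object PathTaggedSketch {
  // Expand a path: a directory contributes every regular file directly inside it.
  def expand(p: Path): Seq[Path] =
    if (Files.isDirectory(p)) {
      val stream = Files.list(p)
      try stream.iterator().asScala.filter(f => Files.isRegularFile(f)).toList.sortBy(_.toString)
      finally stream.close()
    } else Seq(p)

  // Pair each line of text with the path of the file it came from.
  def linesWithPath(paths: Seq[Path]): Seq[(String, String)] =
    for {
      file <- paths.flatMap(expand)
      line <- Files.readAllLines(file).asScala.toSeq
    } yield (file.toString, line)
}
```

Keeping the originating path alongside each line is useful when the file name itself carries information, such as a date or a partition key.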
TEXT SOURCES
Create a new DList from the "key" contents of a list of one or more Sequence Files. Note that the type parameter K is the "converted" Scala type for the Writable key type that must be contained in the Sequence Files. If a directory is specified, the input comprises all the files in that directory.
Create a new DList from the "key" contents of one or more Sequence Files. Note that the type parameter K is the "converted" Scala type for the Writable key type that must be contained in the Sequence Files. If a directory is specified, the input comprises all the files in that directory.
Create a new DList from the "key" contents of a list of one or more Sequence Files. Note that the type parameter K is the "converted" Scala type for the Writable key type that must be contained in the Sequence Files. If a directory is specified, the input comprises all the files in that directory. Each element of the distributed list is a tuple where the first part is the path of the originating file and the second part is the "key".
Partitioned sinks
Create a DObject which will only be evaluated on the cluster.
Persist a distributed list of 'Products' (e.g. Tuples) as a delimited text file.
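Writing 'Products' as delimited text relies on the fact that every Scala tuple and case class is a `Product`, so its fields can be enumerated generically. A minimal sketch of the per-record formatting step follows; `toDelimitedLine` and the sample rows are hypothetical and show only how one record becomes one line, not how the file is written.

```scala
object DelimitedSinkSketch {
  // Render any Product (tuple or case class) as one delimited line by
  // joining the string form of each field with the separator.
  def toDelimitedLine(p: Product, sep: String): String =
    p.productIterator.mkString(sep)

  val rows: Seq[(Int, String, Double)] = Seq((1, "alice", 30.5), (2, "bob", 41.0))

  // One output line per record.
  val lines: Seq[String] = rows.map(toDelimitedLine(_, ","))
}
```

Note that this sketch does no quoting or escaping, so a field that itself contains the separator would produce an ambiguous line.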
Text file I/O
Persist a distributed object of 'Products' (e.g. Tuples) as a delimited text file.
SINKS
Create a text source.
Create a new DList from the "value" contents of a list of one or more Sequence Files. Note that the type parameter V is the "converted" Scala type for the Writable value type that must be contained in the Sequence Files. If a directory is specified, the input comprises all the files in that directory.
Create a new DList from the "value" contents of one or more Sequence Files. Note that the type parameter V is the "converted" Scala type for the Writable value type that must be contained in the Sequence Files. If a directory is specified, the input comprises all the files in that directory.
Create a new DList from the "value" contents of a list of one or more Sequence Files. Note that the type parameter V is the "converted" Scala type for the Writable value type that must be contained in the Sequence Files. If a directory is specified, the input comprises all the files in that directory. Each element of the distributed list is a tuple where the first part is the path of the originating file and the second part is the "value".