checkpoints
Sequence File I/O
Extract a Double from a String.
Extract a Float from a String.
Extract a Long from a String.
Extract an Int from a String.
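The four extractors above can be pictured as objects whose `unapply` attempts a numeric conversion and returns `None` on failure. The following is a minimal sketch in plain Scala, not the library's own code; the object names `AsInt`, `AsLong`, `AsFloat`, `AsDouble` and the helper `describe` are hypothetical and chosen only for illustration.

```scala
import scala.util.Try

object ExtractorSketch {
  // Hypothetical extractor objects: each one succeeds only if the string parses.
  object AsInt    { def unapply(s: String): Option[Int]    = Try(s.trim.toInt).toOption }
  object AsLong   { def unapply(s: String): Option[Long]   = Try(s.trim.toLong).toOption }
  object AsFloat  { def unapply(s: String): Option[Float]  = Try(s.trim.toFloat).toOption }
  object AsDouble { def unapply(s: String): Option[Double] = Try(s.trim.toDouble).toOption }

  // Pattern matching falls through to the next case when a conversion fails.
  def describe(field: String): String = field match {
    case AsInt(i)    => s"int $i"
    case AsDouble(d) => s"double $d"
    case other       => s"text $other"
  }
}
```

Because a failed conversion yields `None` rather than an exception, such extractors can be used directly inside a pattern match over the fields of a line.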
Avro I/O
Create a new DList from the contents of a list of one or more Avro files.
Create a new DList from the contents of a list of one or more Avro files. The type of the DList must conform to the schema types allowed by Avro, as constrained by the 'AvroSchema' type class. If a directory is specified, the input consists of all the files in that directory.
Create a new DList from the contents of one or more Avro files.
Create a new DList from the contents of one or more Avro files. The type of the DList must conform to the schema types allowed by Avro, as constrained by the 'AvroSchema' type class. If a directory is specified, the input consists of all the files in that directory.
Create a distributed list from a text file in which each line is a number of fields delimited by some separator.
Create a distributed list from a text file in which each line is a number of fields delimited by some separator. Use an extractor function to pull out the required fields to create the distributed list.
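The idea of splitting each line on a separator and passing the fields to an extractor function can be sketched in plain Scala over an in-memory sequence of lines. This is an illustration under assumptions, not the library's implementation: `parseDelimited`, `Person`, and the sample data are hypothetical, and lines the extractor does not match are simply dropped.

```scala
import java.util.regex.Pattern

object DelimitedSketch {
  // Split each line on a literal separator, then keep only the lines
  // for which the extractor (a partial function over the fields) is defined.
  def parseDelimited[A](lines: Seq[String], sep: String)(
      extractor: PartialFunction[List[String], A]): Seq[A] =
    lines.map(_.split(Pattern.quote(sep), -1).toList).collect(extractor)

  final case class Person(name: String, age: Int)

  // The second line is skipped because its age field is not numeric.
  val people: Seq[Person] =
    parseDelimited(Seq("alice,31,Sydney", "bob,notANumber,Perth", "carol,27"), ",") {
      case name :: age :: _ if age.nonEmpty && age.forall(_.isDigit) =>
        Person(name, age.toInt)
    }
}
```

Using a partial function keeps the "which fields do I need" decision and the "is this line well formed" check in one place.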
Create a DList from a stream of elements that will only be evaluated on the cluster.
Create a new DList from the contents of a list of one or more Sequence Files.
Create a new DList from the contents of a list of one or more Sequence Files. Note that the type parameters K and V are the "converted" Scala types for the Writable key-value types that must be contained in the Sequence Files. If a directory is specified, the input consists of all the files in that directory.
Create a distributed list from one or more files or directories (in the case of a directory, the input consists of all files in that directory).
Create a distributed list from one or more files or directories (in the case of a directory, the input consists of all files in that directory). Each element of the distributed list is a tuple where the first part is the path of the originating file and the second part is a line of text.
Create a distributed list from a list of one or more files or directories (in the case of a directory, the input consists of all files in that directory).
Create a distributed list from a list of one or more files or directories (in the case of a directory, the input consists of all files in that directory). Each element of the distributed list is a tuple where the first part is the path of the originating file and the second part is a line of text.
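The path-and-line pairing described above can be sketched locally with `java.nio.file`, assuming Scala 2.13 or later for `scala.jdk.CollectionConverters`. The names `expand` and `linesWithPath` are hypothetical, and listing only the files directly inside a directory (no recursion) is an assumption of this sketch rather than documented behaviour.

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

object PathLinesSketch {
  // A directory stands for the regular files directly inside it;
  // any other path stands for itself.
  def expand(p: Path): List[Path] =
    if (Files.isDirectory(p)) {
      val stream = Files.list(p)
      try stream.iterator().asScala.filter(f => Files.isRegularFile(f)).toList.sortBy(_.toString)
      finally stream.close()
    } else List(p)

  // Pair every line of text with the path of the file it came from.
  def linesWithPath(paths: Seq[Path]): List[(String, String)] =
    paths.toList.flatMap(expand).flatMap { f =>
      Files.readAllLines(f).asScala.toList.map(line => (f.toString, line))
    }
}
```

Keeping the originating path next to each line is useful when the file name itself carries information, such as a date or a partition key.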
Create a new DList from the "key" contents of a list of one or more Sequence Files.
Create a new DList from the "key" contents of a list of one or more Sequence Files. Note that the type parameter K is the "converted" Scala type for the Writable key type that must be contained in the Sequence Files. If a directory is specified, the input consists of all the files in that directory.
Create a new DList from the "key" contents of one or more Sequence Files.
Create a new DList from the "key" contents of one or more Sequence Files. Note that the type parameter K is the "converted" Scala type for the Writable key type that must be contained in the Sequence Files. If a directory is specified, the input consists of all the files in that directory.
Specify a distributed list to be persistent by converting its elements to Writables and storing it to disk as the "key" component in a Sequence File.
Create a DObject that will only be evaluated on the cluster.
Persist a distributed list of 'Products' (e.g. Tuples) as a delimited text file.
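Writing 'Products' as delimited text amounts to joining the fields of each element with the separator. The sketch below shows that step in plain Scala on an ordinary list; `toDelimitedLine` and the tab default are hypothetical choices for illustration, and fields that contain the separator are not escaped.

```scala
object DelimitedOutputSketch {
  // Join the fields of any Product (tuple or case class) with the separator.
  def toDelimitedLine(p: Product, sep: String = "\t"): String =
    p.productIterator.mkString(sep)

  // Each tuple becomes one line of comma-separated text.
  val lines: List[String] =
    List(("alice", 31, 1.5), ("bob", 27, 2.5)).map(toDelimitedLine(_, ","))
}
```

Because every tuple and case class is a `Product`, the same function covers both without needing one overload per arity.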
Text file I/O
Persist a distributed object of 'Products' (e.g. Tuples) as a delimited text file.
Create a text source.
Specify a distributed list to be persistent by storing it to disk as an Avro File.
Specify a distributed list to be persistent by converting its elements to Writables and storing it to disk as "key-values" in a Sequence File.
Persist a distributed list as a text file.
Create a new DList from the "value" contents of a list of one or more Sequence Files.
Create a new DList from the "value" contents of a list of one or more Sequence Files. Note that the type parameter V is the "converted" Scala type for the Writable value type that must be contained in the Sequence Files. If a directory is specified, the input consists of all the files in that directory.
Create a new DList from the "value" contents of one or more Sequence Files.
Create a new DList from the "value" contents of one or more Sequence Files. Note that the type parameter V is the "converted" Scala type for the Writable value type that must be contained in the Sequence Files. If a directory is specified, the input consists of all the files in that directory.
Specify a distributed list to be persistent by converting its elements to Writables and storing it to disk as the "value" component in a Sequence File.
This trait provides ways to create DLists from files and to add sinks to DLists so that the results of computations can be saved to files.