Definition of the Cluster addresses: FileSystem + JobTracker
Implementation of the Cluster trait taking the configuration from a ScoobiConfiguration object
This object provides a set of operations to create distributed lists.
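As a sketch of what such operations look like, here is a word-count style example written against a tiny in-memory stand-in (the DList name and its map/flatMap combinators mirror the shape of the Scoobi API, but this mock and the reduceByKey helper are purely illustrative and run on local Seqs, not on a cluster):

```scala
// Illustrative stand-in only: a "distributed list" backed by a local Seq.
final case class DList[A](run: Seq[A]) {
  def map[B](f: A => B): DList[B]               = DList(run.map(f))
  def flatMap[B](f: A => Iterable[B]): DList[B] = DList(run.flatMap(f))
}

// Hypothetical helper: per-key reduction, as a word count needs
def reduceByKey[K, V](xs: DList[(K, V)])(op: (V, V) => V): DList[(K, V)] =
  DList(xs.run.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(op) }.toSeq)

// word count over an in-memory "input"
val lines  = DList(Seq("a rose is a rose", "is a rose"))
val counts = reduceByKey(lines.flatMap(_.split(" ")).map(w => (w, 1)))(_ + _)

println(counts.run.sortBy(_._1))
```

The real distributed lists are lazy descriptions of a computation graph; this mock evaluates eagerly, which is enough to show how the combinators compose.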
Implicit conversions for DObjects
This trait provides methods to execute map-reduce code, either locally or on the cluster.
Log factory used for testing.
By default it doesn't display any log messages, unless the QUIET attribute is set to false.
It can display SCOOBI_TIMES messages if the SHOW_TIMES attribute is true.
This trait provides way to create DLists from files and to add sinks to DLists so that the results of computations can be saved to files
This trait defines:
- the library jars which can be uploaded to the cluster
- a method to upload and reference them on the classpath for cluster jobs
Execute Hadoop code locally
This trait provides operations and implicit definitions to persist DLists and DObjects.
Generally, calling persist on a DList or a DObject executes the computations but doesn't return any value, while run
returns values, triggering the computations if necessary (but not if persist has already been called).
Several DLists and DObjects which are part of the same logical computation graph can be persisted jointly by calling
persist(objects and lists)
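The persist/run contract described above can be sketched with a small memoising computation node (plain Scala, purely illustrative; the Node class and its members are not the actual Scoobi implementation):

```scala
// Sketch of the contract: persist executes without returning a value,
// run returns the value and only computes if persist hasn't already done so.
final class Node[A](compute: () => A) {
  private var cached: Option[A] = None
  private var evaluations = 0
  def evalCount: Int = evaluations

  def run: A = cached.getOrElse {
    evaluations += 1
    val a = compute()
    cached = Some(a)
    a
  }

  def persist(): Unit = { run; () }
}

val node = new Node(() => (1 to 10).sum)
node.persist()                 // triggers the computation, returns Unit
val v = node.run               // returns 55 without recomputing
println((v, node.evalCount))   // (55,1)
```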
This trait can be extended to create an application running Scoobi code.
Command-line arguments are available in the args attribute (minus the Hadoop-specific ones), and a default implicit ScoobiConfiguration is also accessible to create DLists.
A ScoobiApp will be used in 2 different contexts:
1. from the command line
In that case you will use Hadoop's default configuration files, or you will need to tell this script where to find the configuration files.
2. within sbt
In that case the cluster location can either be defined by:
Then, if it can be determined that the execution will not be local but on the cluster (@see locally), the ScoobiApp
trait will attempt to upload the dependent jars to the libjars directory on the cluster
(if they are not already there, @see LibJars for the details). This behavior can be switched off by overriding the upload
method (override def upload = false) or by passing the 'nolibjars' argument on the command line.
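The two ways of switching the upload off can be sketched like this (mock trait; only the upload member and the 'nolibjars' argument come from the text, everything else is illustrative):

```scala
// Mock of the upload switch only; not the real ScoobiApp trait.
trait UploadSupport {
  def args: Seq[String]
  def upload: Boolean = true                 // override def upload = false to disable
  final def shouldUploadJars: Boolean =
    upload && !args.contains("nolibjars")    // 'nolibjars' also disables the upload
}

val default    = new UploadSupport { val args = Seq.empty[String] }
val overridden = new UploadSupport { val args = Seq.empty[String]; override def upload = false }
val byArgument = new UploadSupport { val args = Seq("nolibjars") }

println((default.shouldUploadJars, overridden.shouldUploadJars, byArgument.shouldUploadJars))
// (true,false,false)
```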
This trait provides a ScoobiConfiguration object initialised with the configuration files found in the $HADOOP_HOME/conf directory.
If the ScoobiArgs indicates a local execution the configuration files are not added to the Hadoop configuration object because it wouldn't be possible to remove them afterwards.
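A sketch of how such configuration files might be located (core-site.xml, mapred-site.xml and hdfs-site.xml are the standard Hadoop file names; the helper function itself is hypothetical):

```scala
import java.nio.file.Paths

// Hypothetical helper: candidate Hadoop configuration files under $HADOOP_HOME/conf
def hadoopConfigFiles(hadoopHome: String): Seq[String] =
  Seq("core-site.xml", "mapred-site.xml", "hdfs-site.xml")
    .map(name => Paths.get(hadoopHome, "conf", name).toString)

println(hadoopConfigFiles("/opt/hadoop"))
```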
This trait defines all the options which can be used to modify the behavior of a Scoobi application
This trait can be mixed into an Application trait in order to store command-line arguments before any code is executed.
Definition of the interpreter loop.
This trait can be extended if you want to provide alternative imports or behaviors
Implementation of the ScoobiArgs trait taking the values from the command line arguments
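A minimal sketch of such an implementation (the "quiet" and "times" flag names echo the QUIET and SHOW_TIMES attributes mentioned earlier; the trait shape and parsing scheme here are assumptions, not the real Scoobi code):

```scala
// Illustrative args-backed implementation with assumed flag names
trait ScoobiArgsLike {
  def quiet: Boolean
  def showTimes: Boolean
}

final case class CommandLineScoobiArgs(arguments: Seq[String]) extends ScoobiArgsLike {
  def quiet: Boolean     = arguments.contains("quiet")
  def showTimes: Boolean = arguments.contains("times")
}

val parsed = CommandLineScoobiArgs(Seq("times"))
println((parsed.quiet, parsed.showTimes))  // (false,true)
```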
This object provides a set of operations to create distributed objects.
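As with distributed lists, here is a tiny in-memory stand-in showing the kind of operation such an object provides, e.g. materialising a list of distributed elements into a single distributed value (the names follow the Scoobi flavour; the implementation is a local mock):

```scala
// Illustrative mocks only: DObject wraps a single (eventually computed) value,
// and materialise turns a list of elements into one distributed value.
final case class DObject[A](run: A)
final case class DList[A](run: Seq[A]) {
  def materialise: DObject[Iterable[A]] = DObject(run)
}

val all   = DList(Seq("a", "b", "c")).materialise
val total = DObject(DList(Seq(1, 2, 3)).run.sum)
println((all.run, total.run))
```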
A REPL for Scoobi.
Run the 'scoobi' script, which will bring you into the Scala REPL.
You're now good to go!
This trait provides methods to execute map-reduce code, either locally or on the cluster.
LocalHadoop
To use this trait, you need to provide: