Perform an equijoin using block-join (aka replicate fragment join).
Replicate the small (left) side n times, including the id of the replica in the key. On the right side, add a random integer from 0 to n-1 to the key. Join on the resulting pseudo-key, then strip out the extra fields.
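Useful when the join keys are skewed and the datasets are large, because the records of a frequent key are spread over n pseudo-keys instead of one.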
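The replicate-and-salt steps described above can be sketched on ordinary Scala collections. This is only an illustration under stated assumptions: `BlockJoinSketch`, its `blockJoin` signature and the `rng` parameter are hypothetical names, and a plain `Seq` stands in for a distributed list. It is not the library's implementation.

```scala
import scala.util.Random

object BlockJoinSketch {
  // Hypothetical in-memory stand-in: Seq plays the role of a distributed list.
  def blockJoin[K, A, B](small: Seq[(K, A)], large: Seq[(K, B)], n: Int,
                         rng: Random = new Random()): Seq[(K, (A, B))] = {
    // Replicate each record of the small (left) side n times,
    // putting the replica id into the key.
    val replicated: Seq[((K, Int), A)] =
      for { (k, a) <- small; r <- 0 until n } yield ((k, r), a)

    // Add a random integer from 0 to n-1 to each key of the large (right) side.
    val salted: Seq[((K, Int), B)] =
      large.map { case (k, b) => ((k, rng.nextInt(n)), b) }

    // Index the replicated side by pseudo-key (original key, replica id).
    val leftByPseudoKey: Map[(K, Int), Seq[A]] =
      replicated.groupBy(_._1).map { case (pk, rows) => (pk, rows.map(_._2)) }

    // Equijoin on the pseudo-key, then strip the replica id from the result.
    for {
      ((k, r), b) <- salted
      a <- leftByPseudoKey.getOrElse((k, r), Seq.empty[A])
    } yield (k, (a, b))
  }
}
```

Because every replica id from 0 to n-1 exists on the replicated side, each right-hand record finds the same matches it would find in a plain equijoin, whichever random id it draws.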
Perform a co-group of two (2) distributed lists.
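As a rough illustration of what a co-group produces, the sketch below groups two keyed collections by key and pairs up the values found on each side. `CoGroupSketch` and its `coGroup` signature are hypothetical, and a plain `Seq` is assumed in place of a distributed list.

```scala
object CoGroupSketch {
  // For every key seen on either side, collect all left values and all right values.
  def coGroup[K, A, B](d1: Seq[(K, A)], d2: Seq[(K, B)]): Map[K, (Seq[A], Seq[B])] = {
    val left: Map[K, Seq[(K, A)]] = d1.groupBy(_._1)
    val right: Map[K, Seq[(K, B)]] = d2.groupBy(_._1)
    (left.keySet ++ right.keySet).map { k =>
      val as = left.getOrElse(k, Seq.empty[(K, A)]).map(_._2)
      val bs = right.getOrElse(k, Seq.empty[(K, B)]).map(_._2)
      (k, (as, bs)) // a side with no values for k contributes an empty Seq
    }.toMap
  }
}
```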
Perform an equijoin of two (2) distributed lists.
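A minimal sketch of equijoin semantics on in-memory collections, assuming a plain `Seq` in place of a distributed list. `EquiJoinSketch` and its `join` signature are hypothetical names chosen for illustration.

```scala
object EquiJoinSketch {
  // Emit one (key, (a, b)) pair for every combination of left and right
  // values that share the same key; unmatched keys are dropped.
  def join[K, A, B](d1: Seq[(K, A)], d2: Seq[(K, B)]): Seq[(K, (A, B))] = {
    val rightByKey: Map[K, Seq[B]] =
      d2.groupBy(_._1).map { case (k, rows) => (k, rows.map(_._2)) }
    for {
      (k, a) <- d1
      b <- rightByKey.getOrElse(k, Seq.empty[B])
    } yield (k, (a, b))
  }
}
```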
Perform a full outer-join of two distributed lists. The default function specifies how to construct an A or a B when there is none.
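The role of the default function can be illustrated with a small in-memory sketch. Everything here is an assumption made for the example: `FullOuterJoinSketch`, the `joinFullOuter` signature and the `defaultA` / `defaultB` parameters are hypothetical, and a plain `Seq` stands in for a distributed list.

```scala
object FullOuterJoinSketch {
  // Keep every key from both sides; when one side has no value for a key,
  // build one from the key with the corresponding default function.
  def joinFullOuter[K, A, B](d1: Seq[(K, A)], d2: Seq[(K, B)],
                             defaultA: K => A, defaultB: K => B): Seq[(K, (A, B))] = {
    val left: Map[K, Seq[A]] =
      d1.groupBy(_._1).map { case (k, rows) => (k, rows.map(_._2)) }
    val right: Map[K, Seq[B]] =
      d2.groupBy(_._1).map { case (k, rows) => (k, rows.map(_._2)) }
    // Keys in order of first appearance, left side first.
    val keys: Seq[K] = (d1.map(_._1) ++ d2.map(_._1)).distinct
    keys.flatMap { k =>
      val as = left.getOrElse(k, Seq(defaultA(k)))
      val bs = right.getOrElse(k, Seq(defaultB(k)))
      for { a <- as; b <- bs } yield (k, (a, b))
    }
  }
}
```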
Perform a left outer-join of two (2) distributed lists. Note the return type of Option[B]: it is None when there is no value in the right dlist (d2) for a value in the left dlist (d1).
Perform a right outer-join of two (2) distributed lists. Note the return type of Option[A]: it is None when there is no value in the left dlist (d1) for a value in the right dlist (d2).
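The Option-typed results of the left and right outer joins can be sketched as follows. This is an in-memory illustration only: `OuterJoinSketch` and the `joinLeft` / `joinRight` signatures are hypothetical stand-ins, with a plain `Seq` assumed in place of a distributed list.

```scala
object OuterJoinSketch {
  // Left outer join: every left record is kept; the right value is None
  // when d2 has nothing for that key.
  def joinLeft[K, A, B](d1: Seq[(K, A)], d2: Seq[(K, B)]): Seq[(K, (A, Option[B]))] = {
    val rightByKey: Map[K, Seq[B]] =
      d2.groupBy(_._1).map { case (k, rows) => (k, rows.map(_._2)) }
    d1.flatMap { case (k, a) =>
      rightByKey.get(k) match {
        case Some(bs) => bs.map(b => (k, (a, Some(b): Option[B])))
        case None     => Seq((k, (a, Option.empty[B])))
      }
    }
  }

  // Right outer join: the mirror image, obtained by swapping the inputs
  // of joinLeft and then swapping each value pair back.
  def joinRight[K, A, B](d1: Seq[(K, A)], d2: Seq[(K, B)]): Seq[(K, (Option[A], B))] =
    joinLeft(d2, d1).map { case (k, (b, a)) => (k, (a, b)) }
}
```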