java.lang.Object
  org.infinispan.distexec.mapreduce.MapReduceTask<KIn,VIn,KOut,VOut>
public class MapReduceTask<KIn,VIn,KOut,VOut>
MapReduceTask is a distributed task that allows a large-scale computation to be transparently parallelized across Infinispan cluster nodes.
Users should instantiate MapReduceTask with a reference to the cache whose data is used as input for this task. The Infinispan execution environment will migrate and execute instances of the provided Mapper and Reducer seamlessly across Infinispan nodes.
Unless otherwise specified using the onKeys(Object...) filter, all available key/value pairs of the specified cache will be used as input data for this task.
For example, a MapReduceTask that counts the number of word occurrences in a particular cache, where keys and values are String instances, could be written as follows:

    MapReduceTask<String, String, String, Integer> task =
        new MapReduceTask<String, String, String, Integer>(cache);
    task.mappedWith(new WordCountMapper()).reducedWith(new WordCountReducer());
    Map<String, Integer> results = task.execute();

The final result is a map in which each key is a word and each value is the number of occurrences of that word.
The accompanying Mapper and Reducer are defined as follows:

    private static class WordCountMapper implements Mapper<String, String, String, Integer> {
        // Emits (word, 1) for every whitespace-separated token in the value.
        public void map(String key, String value, Collector<String, Integer> collector) {
            StringTokenizer tokens = new StringTokenizer(value);
            while (tokens.hasMoreTokens()) {
                String s = tokens.nextToken();
                collector.emit(s, 1);
            }
        }
    }

    private static class WordCountReducer implements Reducer<String, Integer> {
        // Sums all counts emitted for a given word.
        public Integer reduce(String key, Iterator<Integer> iter) {
            int sum = 0;
            while (iter.hasNext()) {
                sum += iter.next();
            }
            return sum;
        }
    }

Note that Mapper and Reducer should not be specified as non-static inner classes; the example above uses static nested classes for this reason. Inner classes declared in non-static contexts contain implicit non-transient references to enclosing class instances, so serializing such an inner class instance results in serialization of its associated outer class instance as well.
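The serialization concern in the note above can be reproduced with plain Java serialization, without Infinispan. The following is a minimal sketch: the class names (InnerClassSerializationSketch, InnerTask, NestedTask) and the helper canSerialize are hypothetical and exist only for illustration. The enclosing class is deliberately not Serializable, so the hidden reference held by the inner class makes serialization fail, while the static nested class is unaffected.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical enclosing class; deliberately NOT Serializable.
class InnerClassSerializationSketch {

    private final String name = "outer";

    // Non-static inner class: holds an implicit reference to the enclosing instance.
    class InnerTask implements Serializable {
        String describe() {
            return name; // uses the enclosing instance
        }
    }

    // Static nested class: has no reference to any enclosing instance.
    static class NestedTask implements Serializable {
    }

    // Returns true if the object can be written with standard Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false; // some reachable object is not Serializable
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InnerClassSerializationSketch outer = new InnerClassSerializationSketch();
        System.out.println("inner:  " + canSerialize(outer.new InnerTask()));
        System.out.println("nested: " + canSerialize(new NestedTask()));
    }
}
```

If the enclosing class were itself Serializable, the inner instance would serialize, but the whole outer instance would travel with it, which is the overhead the note warns about.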
If you are not familiar with the map/reduce distributed execution model, start with Google's MapReduce research paper.
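For readers new to the model, the two phases of the word count example can be sketched in a single JVM using only the Java standard library. This is an illustration of the map and reduce steps, not of how Infinispan distributes them; the class LocalWordCountSketch and the methods mapPhase and reducePhase are hypothetical names, and a plain HashMap stands in for the cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical single-JVM illustration of the map and reduce phases.
class LocalWordCountSketch {

    // Map phase: emit (word, 1) for every token of every value,
    // grouping the emitted values by word.
    static Map<String, List<Integer>> mapPhase(Map<String, String> input) {
        Map<String, List<Integer>> intermediate = new HashMap<>();
        for (String value : input.values()) {
            StringTokenizer tokens = new StringTokenizer(value);
            while (tokens.hasMoreTokens()) {
                intermediate.computeIfAbsent(tokens.nextToken(), k -> new ArrayList<>()).add(1);
            }
        }
        return intermediate;
    }

    // Reduce phase: sum the emitted values for each word.
    static Map<String, Integer> reducePhase(Map<String, List<Integer>> intermediate) {
        Map<String, Integer> result = new TreeMap<>(); // sorted for stable output
        for (Map.Entry<String, List<Integer>> e : intermediate.entrySet()) {
            int sum = 0;
            for (int count : e.getValue()) {
                sum += count;
            }
            result.put(e.getKey(), sum);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>(); // stand-in for the input cache
        cache.put("1", "hello world");
        cache.put("2", "hello infinispan");
        // prints {hello=2, infinispan=1, world=1}
        System.out.println(reducePhase(mapPhase(cache)));
    }
}
```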
Field Summary | |
---|---|
protected Marshaller | marshaller |
Constructor Summary | |
---|---|
MapReduceTask(Cache<KIn,VIn> masterCacheNode) | Create a new MapReduceTask given a master cache node. |
Method Summary | | |
---|---|---|
protected Mapper<KIn,VIn,KOut,VOut> | clone(Mapper<KIn,VIn,KOut,VOut> mapper) | |
protected Reducer<KOut,VOut> | clone(Reducer<KOut,VOut> reducer) | |
Map<KOut,VOut> | execute() | Executes this task across Infinispan cluster nodes. |
<R> R | execute(Collator<KOut,VOut,R> collator) | Executes this task across the Infinispan cluster; the final result is collated using the specified Collator. |
Future<Map<KOut,VOut>> | executeAsynchronously() | Executes this task across Infinispan cluster nodes asynchronously. |
<R> Future<R> | executeAsynchronously(Collator<KOut,VOut,R> collator) | Executes this task asynchronously across the Infinispan cluster; the final result is collated using the specified Collator and wrapped by a Future. |
protected void | groupKeys(Map<KOut,List<VOut>> finalReduced, Map<KOut,VOut> mapReceived) | |
protected Map<Address,List<KIn>> | mapKeysToNodes() | |
MapReduceTask<KIn,VIn,KOut,VOut> | mappedWith(Mapper<KIn,VIn,KOut,VOut> mapper) | Specifies the Mapper to use for this MapReduceTask. |
MapReduceTask<KIn,VIn,KOut,VOut> | onKeys(KIn... input) | Rather than use all available keys as input, onKeys allows users to specify a subset of keys as input to this task. |
MapReduceTask<KIn,VIn,KOut,VOut> | reducedWith(Reducer<KOut,VOut> reducer) | Specifies the Reducer to use for this MapReduceTask. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected final Marshaller marshaller
Constructor Detail |
---|
public MapReduceTask(Cache<KIn,VIn> masterCacheNode)
Parameters: masterCacheNode - cache node initiating the map reduce task
Method Detail |
---|
public MapReduceTask<KIn,VIn,KOut,VOut> onKeys(KIn... input)
Rather than use all available keys as input, onKeys allows users to specify a subset of keys as input to this task.
Parameters: input - input keys for this task
public MapReduceTask<KIn,VIn,KOut,VOut> mappedWith(Mapper<KIn,VIn,KOut,VOut> mapper)
Specifies the Mapper to use for this MapReduceTask.
Note that the Mapper should not be specified as a non-static inner class. Inner classes declared in non-static contexts contain implicit non-transient references to enclosing class instances, so serializing such an inner class instance results in serialization of its associated outer class instance as well.
Parameters: mapper - the Mapper to use
public MapReduceTask<KIn,VIn,KOut,VOut> reducedWith(Reducer<KOut,VOut> reducer)
Specifies the Reducer to use for this MapReduceTask.
Note that the Reducer should not be specified as a non-static inner class. Inner classes declared in non-static contexts contain implicit non-transient references to enclosing class instances, so serializing such an inner class instance results in serialization of its associated outer class instance as well.
Parameters: reducer - the Reducer to use
public Map<KOut,VOut> execute() throws CacheException
Executes this task across Infinispan cluster nodes.
Throws: CacheException
public Future<Map<KOut,VOut>> executeAsynchronously()
public <R> R execute(Collator<KOut,VOut,R> collator)
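Executes this task across the Infinispan cluster; the final result is collated using the specified Collator.
Parameters: collator - a Collator to use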
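The idea of collation, turning the reduced Map<KOut,VOut> into a single result of type R, can be sketched with the standard library alone. The interface LocalCollator below is a hypothetical stand-in defined for this example, and its method name collate is an assumption rather than a statement of the Infinispan Collator contract. The sample collator picks the most frequent word from a word count result.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of collating a reduced map into one value.
class CollatorSketch {

    // Stand-in interface (an assumption, not the Infinispan type):
    // converts the reduced results into a single result of type R.
    interface LocalCollator<K, V, R> {
        R collate(Map<K, V> reducedResults);
    }

    // Picks the word with the highest count; ties go to the
    // alphabetically smaller word. Returns null for an empty map.
    static final LocalCollator<String, Integer, String> MOST_FREQUENT = reduced -> {
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : reduced.entrySet()) {
            int count = e.getValue();
            if (best == null
                    || count > bestCount
                    || (count == bestCount && e.getKey().compareTo(best) < 0)) {
                best = e.getKey();
                bestCount = count;
            }
        }
        return best;
    };

    public static void main(String[] args) {
        Map<String, Integer> reduced = new HashMap<>(); // stand-in for the reduce output
        reduced.put("hello", 2);
        reduced.put("world", 1);
        reduced.put("infinispan", 1);
        System.out.println(MOST_FREQUENT.collate(reduced)); // prints hello
    }
}
```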
public <R> Future<R> executeAsynchronously(Collator<KOut,VOut,R> collator)
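Executes this task asynchronously across the Infinispan cluster; the final result is collated using the specified Collator and wrapped by a Future.
Parameters: collator - a Collator to use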
protected void groupKeys(Map<KOut,List<VOut>> finalReduced, Map<KOut,VOut> mapReceived)
protected Map<Address,List<KIn>> mapKeysToNodes()
protected Mapper<KIn,VIn,KOut,VOut> clone(Mapper<KIn,VIn,KOut,VOut> mapper)
protected Reducer<KOut,VOut> clone(Reducer<KOut,VOut> reducer)