java - Custom map-reduce input formatter for Cassandra using native protocol
I am using Apache Cassandra (1.2) and Apache Hadoop map-reduce to process some data. I currently use org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat. This input format uses Thrift to pull the data. It seems that Thrift is quite slow (reading 300M records from a 3-node cluster takes 8 hours), and since the native binary protocol exists, I wonder whether anyone has used it.
Any other optimizations and configuration tweaks are also of interest, though that is a separate issue.
My questions are:
- Is there an implementation of a map-reduce input format that directly uses the Cassandra native protocol?
- If not, and using the DataStax driver, what would be the first steps to writing your own?
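On the second question, one plausible first step when writing your own input format is deciding how to carve the token ring into input splits, each of which a record reader could then query over the native protocol. The following is a minimal sketch of just that step, using only the JDK. It assumes the cluster uses Murmur3Partitioner (signed 64-bit tokens); `TokenRangeSplitter`, `split`, `cqlFor`, the table `ks.tbl`, and the key column `id` are hypothetical names, not part of Cassandra or the DataStax driver.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: divides the Murmur3 token ring into contiguous ranges. */
final class TokenRangeSplitter {

    // Assumption: Murmur3Partitioner, whose tokens are signed 64-bit values.
    private static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Returns `parts` ranges as {startExclusive, endInclusive} pairs covering the ring. */
    static List<long[]> split(int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        // BigInteger avoids overflow: the ring is wider than Long.MAX_VALUE.
        BigInteger width = MAX.subtract(MIN);
        BigInteger n = BigInteger.valueOf(parts);
        List<long[]> ranges = new ArrayList<>();
        BigInteger start = MIN;
        for (int i = 1; i <= parts; i++) {
            // The last range always ends at MAX so rounding never leaves a gap.
            BigInteger end = (i == parts)
                    ? MAX
                    : MIN.add(width.multiply(BigInteger.valueOf(i)).divide(n));
            ranges.add(new long[] { start.longValueExact(), end.longValueExact() });
            start = end;
        }
        return ranges;
    }

    /**
     * Builds the CQL a record reader might run for one range.
     * The start is exclusive, so a row whose token is exactly Long.MIN_VALUE
     * would be skipped; whether that can occur is left as an assumption here.
     */
    static String cqlFor(String table, String partitionKey, long[] range) {
        return String.format(Locale.ROOT,
                "SELECT * FROM %s WHERE token(%s) > %d AND token(%s) <= %d",
                table, partitionKey, range[0], partitionKey, range[1]);
    }

    public static void main(String[] args) {
        // Print one query per split for a hypothetical table and key column.
        for (long[] range : split(4)) {
            System.out.println(cqlFor("ks.tbl", "id", range));
        }
    }
}
```

A real input format would more likely derive its splits from the cluster's actual token ownership, so that each split can be read from a replica that holds it; equal-width ranges are only the simplest starting point.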
- org.apache.cassandra.hadoop.cql3.CqlInputFormat
- org.apache.cassandra.hadoop.cql3.CqlRecordReader
- org.apache.cassandra.hadoop.cql3.CqlConfigHelper
The example code in hadoop_cql3_word_count has been updated to use these classes.
It has been started