Deleting HBase data with MapReduce by reading an HBase table

月光杯

Blog category: hadoop

package foo.bar.MR;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

import foo.bar.validate.Const;

public class DropRowByTimeStampMapReduce {

	// Shared HBase configuration pointing at the cluster defined in Const.
	public static Configuration configuration;
	static {
		configuration = HBaseConfiguration.create();
		configuration.set("hbase.zookeeper.quorum", Const.ZOOKEEPER_QUORAM);
		configuration.set("hbase.rootdir", Const.HBASE_ROOTDIR);
	}
	    
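	// Mapper: for every cell returned by the scan, delete exactly that cell version.
	static class MyMapper extends TableMapper<Text, LongWritable> {
		private HTable htbl;

		@Override
		protected void setup(Context context) throws IOException {
			// Open the table once per task instead of once per row.
			String tableName = context.getConfiguration().get("tableName");
			htbl = new HTable(context.getConfiguration(), tableName);
		}

		@Override
		public void map(ImmutableBytesWritable row, Result r, Context context)
				throws InterruptedException, IOException {
			List<Delete> lists = new ArrayList<Delete>();
			for (KeyValue kv : r.raw()) {
				Delete dlt = new Delete(kv.getRow());
				// Passing the timestamp removes only this one version of the column.
				dlt.deleteColumn(kv.getFamily(), kv.getQualifier(), kv.getTimestamp());
				lists.add(dlt);
				System.out.println("delete-- rowkey:"+Bytes.toString(kv.getRow())+",family:"+Bytes.toString(kv.getFamily())+",qualifier:"+Bytes.toString(kv.getQualifier())+",timestamp:"+kv.getTimestamp());
			}
			htbl.delete(lists);
		}

		@Override
		protected void cleanup(Context context) throws IOException {
			// Flush pending mutations and release the table when the task ends.
			htbl.flushCommits();
			htbl.close();
		}
	}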
		   
	   
	public static void main(String[] args) throws Exception {
		if (args.length != 2) {
			System.err.println("Usage: DropRowByTimeStampMapReduce <tableName> <timestamp>");
			System.exit(1);
		}
		String tableName = args[0];
		long timeStamp = Long.parseLong(args[1]);

		// Start from the shared configuration so the job sees the ZooKeeper quorum.
		Configuration config = new Configuration(configuration);
		config.set("tableName", tableName);   // read back by MyMapper
		Job job = new Job(config, "DropRowByTimeStamp");
		job.setJarByClass(DropRowByTimeStampMapReduce.class);     // class that contains mapper

		Scan scan = new Scan();
		scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
		scan.setCacheBlocks(false);  // don't set to true for MR jobs
		scan.setTimeStamp(timeStamp); // return only cells written with exactly this timestamp

		TableMapReduceUtil.initTableMapperJob(
		  tableName,        // input HBase table name
		  scan,             // Scan instance to control CF and attribute selection
		  MyMapper.class,   // mapper
		  null,             // mapper output key
		  null,             // mapper output value
		  job);
		job.setOutputFormatClass(NullOutputFormat.class);   // because we aren't emitting anything from mapper
		job.setNumReduceTasks(0);                           // map-only job, no reducer needed

		boolean b = job.waitForCompletion(true);
		if (!b) {
		  throw new IOException("error with job!");
		}
	}
	    
	   
}
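
The job above restricts the scan to a single timestamp and then issues one Delete per returned cell with that same timestamp, so only the matching version of each column is removed and other versions of the same column are left in place. A minimal sketch of that behavior follows. It is a toy in-memory model, not the HBase API: the class `VersionedCellDemo`, its methods, and the sample timestamps and values are all hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of a single HBase column: timestamp -> value.
// NOT the HBase API; it only illustrates exact-version deletion.
public class VersionedCellDemo {

    // Hypothetical sample data: three versions of one column.
    static TreeMap<Long, String> sampleVersions() {
        TreeMap<Long, String> versions = new TreeMap<>();
        versions.put(1000L, "v1");
        versions.put(2000L, "v2");
        versions.put(3000L, "v3");
        return versions;
    }

    // Remove only the version whose timestamp matches exactly.
    // Returns true if a version was actually removed.
    static boolean deleteExactVersion(Map<Long, String> versions, long timestamp) {
        return versions.remove(timestamp) != null;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> versions = sampleVersions();
        deleteExactVersion(versions, 2000L);
        System.out.println(versions); // prints {1000=v1, 3000=v3}
    }
}
```

If the goal were instead to drop every version up to a given timestamp, the per-cell delete in the mapper would need a different call; the model above only covers the exact-match case used in this post.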


2013-11-24 22:01
Category: Programming languages

#1 tonyyan 2018-05-10

Thanks for sharing!
