Hands-On Project: Umeng Project Introduction and Environment Setup
Published: 2019-06-19



Author: Yin Zhengjie

Copyright notice: this is an original work. Reposting is not permitted; violators will be held legally responsible.

 

 

  

 

I. Project architecture overview

In outline, the pipeline built in this post works as follows: client apps report usage logs to an Nginx (OpenResty) reverse proxy; a Flume agent tails the Nginx access log and publishes each line to a Kafka topic; a second Flume agent drains that topic into HDFS; and Hive tables are defined over the HDFS data for downstream analysis.

    Nginx access.log -> Flume (exec source) -> Kafka topic yinzhengjie-umeng-raw-logs -> Flume (Kafka source, HDFS sink) -> HDFS -> Hive

 

 

II. Environment setup

1>. Setting up the Nginx reverse proxy

   Reference notes: https://www.cnblogs.com/yinzhengjie/p/9428404.html
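The note above walks through the full OpenResty/Nginx installation. As a quick sanity check (a minimal sketch, assuming OpenResty on s101 listens on port 80 and writes the default access log that the Flume agent tails later in this section), you can issue a request and confirm that it lands in the access log:

# Hypothetical smoke test; the request path is illustrative, not the real reporting endpoint.
curl -s -o /dev/null http://s101/
# The request should show up as the last line of the access log that Flume will tail.
tail -n 1 /usr/local/openresty/nginx/logs/access.log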

2>. Starting the Hadoop cluster

[yinzhengjie@s101 ~]$ start-dfs.sh
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Starting namenodes on [s101 s105]
s101: starting namenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-namenode-s101.out
s105: starting namenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-namenode-s105.out
s104: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s104.out
s102: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s102.out
s103: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s103.out
Starting journal nodes [s102 s103 s104]
s102: starting journalnode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-journalnode-s102.out
s103: starting journalnode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-journalnode-s103.out
s104: starting journalnode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-journalnode-s104.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Starting ZK Failover Controllers on NN hosts [s101 s105]
s101: starting zkfc, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-zkfc-s101.out
s105: starting zkfc, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-zkfc-s105.out
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
25169 Application
27156 NameNode
27972 DFSZKFailoverController
28029 Jps
24542 ConsoleConsumer
Command executed successfully
============= s102 jps ============
8609 QuorumPeerMain
11345 Jps
11110 JournalNode
8999 Kafka
11036 DataNode
Command executed successfully
============= s103 jps ============
7444 Kafka
7753 JournalNode
7100 QuorumPeerMain
7676 DataNode
7951 Jps
Command executed successfully
============= s104 jps ============
6770 QuorumPeerMain
7109 Kafka
7336 DataNode
7610 Jps
7419 JournalNode
Command executed successfully
============= s105 jps ============
19397 NameNode
19255 DFSZKFailoverController
19535 Jps
Command executed successfully
[yinzhengjie@s101 ~]$
Start the HDFS distributed file system ([yinzhengjie@s101 ~]$ start-dfs.sh)
[yinzhengjie@s101 ~]$ hdfs dfs -mkdir -p /home/yinzhengjie/data/logs/umeng/raw-log
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ hdfs dfs -ls -R /home/yinzhengjie/data/logs
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
drwxr-xr-x   - yinzhengjie supergroup          0 2018-08-06 23:36 /home/yinzhengjie/data/logs/umeng
drwxr-xr-x   - yinzhengjie supergroup          0 2018-08-06 23:36 /home/yinzhengjie/data/logs/umeng/raw-log
[yinzhengjie@s101 ~]$
Create the HDFS directory that will hold the raw log files ([yinzhengjie@s101 ~]$ hdfs dfs -mkdir -p /home/yinzhengjie/data/logs/umeng/raw-log)
[yinzhengjie@s101 ~]$ start-yarn.sh
starting yarn daemons
s101: starting resourcemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-resourcemanager-s101.out
s105: starting resourcemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-resourcemanager-s105.out
s102: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-nodemanager-s102.out
s104: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-nodemanager-s104.out
s103: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-nodemanager-s103.out
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
25169 Application
29281 ResourceManager
27156 NameNode
27972 DFSZKFailoverController
30103 Jps
28523 Application
24542 ConsoleConsumer
Command executed successfully
============= s102 jps ============
8609 QuorumPeerMain
11110 JournalNode
8999 Kafka
12343 Jps
11897 NodeManager
11036 DataNode
Command executed successfully
============= s103 jps ============
8369 Jps
7444 Kafka
7753 JournalNode
8091 NodeManager
7100 QuorumPeerMain
7676 DataNode
Command executed successfully
============= s104 jps ============
6770 QuorumPeerMain
7746 NodeManager
8018 Jps
7109 Kafka
7336 DataNode
7419 JournalNode
Command executed successfully
============= s105 jps ============
19956 NodeManager
19397 NameNode
20293 Jps
19255 DFSZKFailoverController
Command executed successfully
[yinzhengjie@s101 ~]$
Start the YARN resource scheduler to support Hive computation ([yinzhengjie@s101 ~]$ start-yarn.sh)
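Optionally, confirm that every NodeManager registered with the active ResourceManager (a quick check, assuming the yarn client on s101 is configured against this cluster):

# Lists the NodeManagers and their state; the four nodes s102-s105 should report RUNNING.
yarn node -list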

3>. Kafka configuration

[yinzhengjie@s101 ~]$ more `which xcall.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

#Check whether the user passed any arguments
if [ $# -lt 1 ];then
        echo "Please provide arguments"
        exit
fi

#Capture the command entered by the user
cmd=$@

for (( i=101;i<=105;i++ ))
do
        #Turn the terminal output green
        tput setaf 2
        echo ============= s$i $cmd ============
        #Restore the original terminal color (light gray)
        tput setaf 7
        #Run the command remotely
        ssh s$i $cmd
        #Check whether the command succeeded
        if [ $? == 0 ];then
                echo "Command executed successfully"
        fi
done
[yinzhengjie@s101 ~]$
The cluster-wide batch execution script ([yinzhengjie@s101 ~]$ more `which xcall.sh`)
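Note that xcall.sh, like the xzk.sh and xkafka.sh scripts below, assumes passwordless SSH from s101 to s101~s105. If that is not in place yet, a typical setup looks like this (a sketch, assuming the same yinzhengjie account exists on every host):

# Generate a key pair once on s101 (accept the default prompts).
ssh-keygen -t rsa
# Distribute the public key to every node, including s101 itself.
for i in 101 102 103 104 105
do
    ssh-copy-id yinzhengjie@s$i
done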
[yinzhengjie@s101 ~]$ more `which xzk.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

#Check whether the user passed exactly one argument
if [ $# -ne 1 ];then
    echo "Invalid argument. Usage: $0  {start|stop|restart|status}"
    exit
fi

#Capture the command entered by the user
cmd=$1

#Define the management function
function zookeeperManger(){
    case $cmd in
    start)
        echo "Starting the service"
        remoteExecution start
        ;;
    stop)
        echo "Stopping the service"
        remoteExecution stop
        ;;
    restart)
        echo "Restarting the service"
        remoteExecution restart
        ;;
    status)
        echo "Checking the status"
        remoteExecution status
        ;;
    *)
        echo "Invalid argument. Usage: $0  {start|stop|restart|status}"
        ;;
    esac
}

#Define the remote execution function
function remoteExecution(){
    for (( i=102 ; i<=104 ; i++ )) ; do
            tput setaf 2
            echo ========== s$i zkServer.sh  $1 ================
            tput setaf 9
            ssh s$i  "source /etc/profile ; zkServer.sh $1"
    done
}

#Call the function
zookeeperManger
[yinzhengjie@s101 ~]$
The ZooKeeper management script ([yinzhengjie@s101 ~]$ more `which xzk.sh`)
[yinzhengjie@s101 ~]$ xzk.sh start
Starting the service
========== s102 zkServer.sh start ================
ZooKeeper JMX enabled by default
Using config: /soft/zk/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
========== s103 zkServer.sh start ================
ZooKeeper JMX enabled by default
Using config: /soft/zk/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
========== s104 zkServer.sh start ================
ZooKeeper JMX enabled by default
Using config: /soft/zk/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
23771 Jps
Command executed successfully
============= s102 jps ============
8609 QuorumPeerMain
8639 Jps
Command executed successfully
============= s103 jps ============
7100 QuorumPeerMain
7135 Jps
Command executed successfully
============= s104 jps ============
6770 QuorumPeerMain
6799 Jps
Command executed successfully
============= s105 jps ============
18932 Jps
Command executed successfully
[yinzhengjie@s101 ~]$
Start the ZooKeeper cluster ([yinzhengjie@s101 ~]$ xzk.sh start)
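Besides jps, the status action of xzk.sh can confirm that the ensemble elected a leader:

# zkServer.sh should print "Mode: leader" on one of s102~s104 and "Mode: follower" on the other two.
xzk.sh status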
[yinzhengjie@s101 ~]$ more `which xkafka.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

#Check whether the user passed exactly one argument
if [ $# -ne 1 ];then
    echo "Invalid argument. Usage: $0  {start|stop}"
    exit
fi

#Capture the command entered by the user
cmd=$1

for (( i=102 ; i<=104 ; i++ )) ; do
    tput setaf 2
    echo ========== s$i  $cmd ================
    tput setaf 9
    case $cmd in
        start)
            ssh s$i  "source /etc/profile ; kafka-server-start.sh -daemon /soft/kafka/config/server.properties"
            echo  s$i  "service started"
            ;;
        stop)
            ssh s$i  "source /etc/profile ; kafka-server-stop.sh"
            echo s$i  "service stopped"
            ;;
        *)
            echo "Invalid argument. Usage: $0  {start|stop}"
            exit
            ;;
    esac
done
[yinzhengjie@s101 ~]$
The Kafka management script ([yinzhengjie@s101 ~]$ more `which xkafka.sh`)
[yinzhengjie@s101 ~]$ xkafka.sh start
========== s102  start ================
s102 service started
========== s103  start ================
s103 service started
========== s104  start ================
s104 service started
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
23921 Jps
Command executed successfully
============= s102 jps ============
8609 QuorumPeerMain
8999 Kafka
9068 Jps
Command executed successfully
============= s103 jps ============
7491 Jps
7444 Kafka
7100 QuorumPeerMain
Command executed successfully
============= s104 jps ============
6770 QuorumPeerMain
7109 Kafka
7176 Jps
Command executed successfully
============= s105 jps ============
18983 Jps
Command executed successfully
[yinzhengjie@s101 ~]$
Start the Kafka cluster ([yinzhengjie@s101 ~]$ xkafka.sh start)
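You can also verify broker registration directly in ZooKeeper (a sketch, assuming zkCli.sh from the ZooKeeper installation is on the PATH; the broker ids depend on each node's server.properties):

# The three brokers on s102~s104 should appear under /brokers/ids.
zkCli.sh -server s102:2181 ls /brokers/ids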
[yinzhengjie@s101 ~]$ kafka-topics.sh --zookeeper s102:2181 --create --topic yinzhengjie-umeng-raw-logs --replication-factor 3 --partitions 4
Created topic "yinzhengjie-umeng-raw-logs".
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ kafka-topics.sh --zookeeper s102:2181 --list
__consumer_offsets
__transaction_state
t7
t9
test
topic1
yinzhengjie
yinzhengjie-umeng-raw-logs
[yinzhengjie@s101 ~]$
Create the topic ([yinzhengjie@s101 ~]$ kafka-topics.sh --zookeeper s102:2181 --create --topic yinzhengjie-umeng-raw-logs --replication-factor 3 --partitions 4)
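To verify the partition and replica layout of the new topic, describe it; each of the four partitions should show three replicas spread across s102~s104:

kafka-topics.sh --zookeeper s102:2181 --describe --topic yinzhengjie-umeng-raw-logs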
[yinzhengjie@s101 conf]$ kafka-console-consumer.sh --zookeeper s102:2181 --topic yinzhengjie-umeng-raw-logs
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
Start a console consumer to watch the topic ([yinzhengjie@s101 conf]$ kafka-console-consumer.sh --zookeeper s102:2181 --topic yinzhengjie-umeng-raw-logs)
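Before wiring up Flume, you can push a test message with the console producer in another terminal and watch it arrive at the consumer above (the message text is arbitrary):

kafka-console-producer.sh --broker-list s102:9092 --topic yinzhengjie-umeng-raw-logs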

4>. Flume configuration

[yinzhengjie@s101 ~]$ more /soft/flume/conf/yinzhengjie-exec-umeng-nginx-to-kafka.conf
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /usr/local/openresty/nginx/logs/access.log

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = yinzhengjie-umeng-raw-logs
a1.sinks.k1.kafka.bootstrap.servers = s102:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 0

a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1
[yinzhengjie@s101 ~]$
The Flume configuration that collects the Nginx access log into the Kafka cluster ([yinzhengjie@s101 ~]$ more /soft/flume/conf/yinzhengjie-exec-umeng-nginx-to-kafka.conf)
[yinzhengjie@s101 ~]$ flume-ng agent -f /soft/flume/conf/yinzhengjie-exec-umeng-nginx-to-kafka.conf -n a1
Warning: No configuration directory set! Use --conf <dir> to override.
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/soft/hadoop/bin/hadoop) for HDFS access
Info: Including HBASE libraries found via (/soft/hbase/bin/hbase) for HBASE access
Info: Including Hive libraries found via () for Hive access
+ exec /soft/jdk/bin/java -Xmx20m -cp '...' org.apache.flume.node.Application -f /soft/flume/conf/yinzhengjie-exec-umeng-nginx-to-kafka.conf -n a1    (the very long Flume/Hadoop/HBase/Hive classpath is omitted here for readability)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/apache-flume-1.8.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hbase-1.2.6/lib/phoenix-4.10.0-HBase-1.2-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hbase-1.2.6/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
18/08/06 21:59:22 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
18/08/06 21:59:22 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:/soft/flume/conf/yinzhengjie-exec-umeng-nginx-to-kafka.conf
18/08/06 21:59:22 INFO conf.FlumeConfiguration: Processing:k1    (this line repeats once per k1 property)
18/08/06 21:59:22 INFO conf.FlumeConfiguration: Added sinks: k1 Agent: a1
18/08/06 21:59:22 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [a1]
18/08/06 21:59:22 INFO node.AbstractConfigurationProvider: Creating channels
18/08/06 21:59:22 INFO channel.DefaultChannelFactory: Creating instance of channel c1 type memory
18/08/06 21:59:22 INFO node.AbstractConfigurationProvider: Created channel c1
18/08/06 21:59:22 INFO source.DefaultSourceFactory: Creating instance of source r1, type exec
18/08/06 21:59:22 INFO sink.DefaultSinkFactory: Creating instance of sink: k1, type: org.apache.flume.sink.kafka.KafkaSink
18/08/06 21:59:22 INFO kafka.KafkaSink: Using the static topic yinzhengjie-umeng-raw-logs. This may be overridden by event headers
18/08/06 21:59:22 INFO node.AbstractConfigurationProvider: Channel c1 connected to [r1, k1]
18/08/06 21:59:22 INFO node.Application: Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource{name:r1,state:IDLE} }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@6bcf9394 counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
18/08/06 21:59:22 INFO node.Application: Starting Channel c1
18/08/06 21:59:22 INFO node.Application: Waiting for channel: c1 to start. Sleeping for 500 ms
18/08/06 21:59:22 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
18/08/06 21:59:22 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: c1 started
18/08/06 21:59:23 INFO node.Application: Starting Sink k1
18/08/06 21:59:23 INFO node.Application: Starting Source r1
18/08/06 21:59:23 INFO source.ExecSource: Exec source starting with command: tail -F /usr/local/openresty/nginx/logs/access.log
18/08/06 21:59:23 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean.
18/08/06 21:59:23 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: r1 started
18/08/06 21:59:23 INFO producer.ProducerConfig: ProducerConfig values: bootstrap.servers = [s102:9092], acks = 1, batch.size = 16384, linger.ms = 0, ...    (the remaining producer defaults are omitted)
18/08/06 21:59:23 INFO utils.AppInfoParser: Kafka version : 0.9.0.1
18/08/06 21:59:23 INFO utils.AppInfoParser: Kafka commitId : 23c69d62a0cabf06
18/08/06 21:59:23 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: k1: Successfully registered new MBean.
18/08/06 21:59:23 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: k1 started
Start the Flume agent that ships the Nginx access log to Kafka ([yinzhengjie@s101 ~]$ flume-ng agent -f /soft/flume/conf/yinzhengjie-exec-umeng-nginx-to-kafka.conf -n a1)
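With this agent running, any hit on Nginx should flow straight through to the topic. A quick end-to-end check (a sketch, reusing the console consumer started earlier and the same port assumption as above):

# The resulting access-log line should be printed by kafka-console-consumer.sh within a second or two.
curl -s -o /dev/null http://s101/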
[yinzhengjie@s101 ~]$ more /soft/flume/conf/yinzhengjie-exec-umeng-kafka-to-hdfs.conf
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 5000
a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = s102:9092
a1.sources.r1.kafka.topics = yinzhengjie-umeng-raw-logs
a1.sources.r1.kafka.consumer.group.id = g10

a1.channels.c1.type=memory

a1.sinks.k1.type = hdfs
#Target directory
a1.sinks.k1.hdfs.path = /home/yinzhengjie/data/logs/umeng/raw-log/%Y%m/%d/%H%M
#File prefix
a1.sinks.k1.hdfs.filePrefix = events-
#Directory rounding
#Whether to round the directory timestamp down
a1.sinks.k1.hdfs.round = true
#Rounding value
a1.sinks.k1.hdfs.roundValue = 1
#Rounding time unit
a1.sinks.k1.hdfs.roundUnit = minute
#File rolling
#Roll interval (seconds)
a1.sinks.k1.hdfs.rollInterval = 30
#Roll size (10 KB)
a1.sinks.k1.hdfs.rollSize = 10240
#Roll count in events (500)
a1.sinks.k1.hdfs.rollCount = 500
#Use the local timestamp
a1.sinks.k1.hdfs.useLocalTimeStamp = true
#File type: DataStream is plain text; the default is a sequence file.
a1.sinks.k1.hdfs.fileType = DataStream

a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1
[yinzhengjie@s101 ~]$
The Flume configuration that drains the Kafka topic into HDFS for storage ([yinzhengjie@s101 ~]$ more /soft/flume/conf/yinzhengjie-exec-umeng-kafka-to-hdfs.conf)
[yinzhengjie@s101 ~]$ flume-ng agent -f /soft/flume/conf/yinzhengjie-exec-umeng-kafka-to-hdfs.conf -n a1
Warning: No configuration directory set! Use --conf <dir> to override.
Warning: JAVA_HOME is not set!
Info: Including Hadoop libraries found via (/soft/hadoop/bin/hadoop) for HDFS access
Info: Including HBASE libraries found via (/soft/hbase/bin/hbase) for HBASE access
Info: Including Hive libraries found via () for Hive access
+ exec /soft/jdk/bin/java -Xmx20m -cp '...' org.apache.flume.node.Application -f /soft/flume/conf/yinzhengjie-exec-umeng-kafka-to-hdfs.conf -n a1    (the very long Flume/Hadoop/HBase/Hive classpath is omitted here for readability)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/apache-flume-1.8.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hbase-1.2.6/lib/phoenix-4.10.0-HBase-1.2-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hbase-1.2.6/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
18/08/06 23:42:46 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
18/08/06 23:42:46 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:/soft/flume/conf/yinzhengjie-exec-umeng-kafka-to-hdfs.conf
18/08/06 23:42:46 INFO conf.FlumeConfiguration: Processing:k1    (this line repeats once per k1 property)
18/08/06 23:42:46 INFO conf.FlumeConfiguration: Added sinks: k1 Agent: a1
18/08/06 23:42:46 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [a1]
18/08/06 23:42:46 INFO node.AbstractConfigurationProvider: Creating channels
18/08/06 23:42:46 INFO channel.DefaultChannelFactory: Creating instance of channel c1 type memory
18/08/06 23:42:46 INFO node.AbstractConfigurationProvider: Created channel c1
18/08/06 23:42:46 INFO source.DefaultSourceFactory: Creating instance of source r1, type org.apache.flume.source.kafka.KafkaSource
18/08/06 23:42:46 INFO sink.DefaultSinkFactory: Creating instance of sink: k1, type: hdfs
18/08/06 23:42:47 INFO node.AbstractConfigurationProvider: Channel c1 connected to [r1, k1]
18/08/06 23:42:47 INFO node.Application: Starting new configuration:{ sourceRunners:{r1=PollableSourceRunner: { source:org.apache.flume.source.kafka.KafkaSource{name:r1,state:IDLE} counterGroup:{ name:null counters:{} } }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@703b8ec0 counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
18/08/06 23:42:47 INFO node.Application: Starting Channel c1
18/08/06 23:42:47 INFO node.Application: Waiting for channel: c1 to start. Sleeping for 500 ms
18/08/06 23:42:47 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
18/08/06 23:42:47 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: c1 started
18/08/06 23:42:47 INFO node.Application: Starting Sink k1
18/08/06 23:42:47 INFO node.Application: Starting Source r1
18/08/06 23:42:47 INFO kafka.KafkaSource: Starting org.apache.flume.source.kafka.KafkaSource{name:r1,state:IDLE}...
18/08/06 23:42:47 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: k1: Successfully registered new MBean.
18/08/06 23:42:47 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: k1 started
18/08/06 23:42:47 INFO consumer.ConsumerConfig: ConsumerConfig values: bootstrap.servers = [s102:9092], group.id = g10, enable.auto.commit = false, auto.offset.reset = latest, ...    (the remaining consumer defaults are omitted)
18/08/06 23:42:47 INFO utils.AppInfoParser: Kafka version : 0.9.0.1
18/08/06 23:42:47 INFO utils.AppInfoParser: Kafka commitId : 23c69d62a0cabf06
18/08/06 23:42:49 INFO kafka.SourceRebalanceListener: topic yinzhengjie-umeng-raw-logs - partition 3 assigned.
18/08/06 23:42:49 INFO kafka.SourceRebalanceListener: topic yinzhengjie-umeng-raw-logs - partition 2 assigned.
18/08/06 23:42:49 INFO kafka.SourceRebalanceListener: topic yinzhengjie-umeng-raw-logs - partition 1 assigned.
18/08/06 23:42:49 INFO kafka.SourceRebalanceListener: topic yinzhengjie-umeng-raw-logs - partition 0 assigned.
18/08/06 23:42:49 INFO kafka.KafkaSource: Kafka source r1 started.
18/08/06 23:42:49 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean.
18/08/06 23:42:49 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: r1 started
Start the Flume agent that drains Kafka into HDFS ([yinzhengjie@s101 ~]$ flume-ng agent -f /soft/flume/conf/yinzhengjie-exec-umeng-kafka-to-hdfs.conf -n a1)
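Once events arrive, the HDFS sink rolls files into minute-stamped directories following the %Y%m/%d/%H%M escape sequences in the sink path. They can be inspected directly:

# List the rolled files and peek at the first events of one of them (the glob matches the %Y%m/%d/%H%M levels).
hdfs dfs -ls -R /home/yinzhengjie/data/logs/umeng/raw-log
hdfs dfs -cat '/home/yinzhengjie/data/logs/umeng/raw-log/*/*/*/events-*' | head -n 3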

5>. Hive configuration

[yinzhengjie@s101 ~]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hbase-1.2.6/lib/phoenix-4.10.0-HBase-1.2-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Logging initialized using configuration in file:/soft/apache-hive-2.1.1-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive (default)> show databases;
OK
database_name
default
yinzhengjie
Time taken: 1.013 seconds, Fetched: 2 row(s)
hive (default)> use yinzhengjie;
OK
Time taken: 0.025 seconds
hive (yinzhengjie)> show tables;
OK
tab_name
student
teacher
teacherbak
teachercopy
Time taken: 0.037 seconds, Fetched: 4 row(s)
hive (yinzhengjie)>
hive (yinzhengjie)> create table raw_logs(
                  >     servertimems float ,
                  >     servertimestr string,
                  >     clientip string,
                  >     clienttimems bigint,
                  >     status int ,
                  >     log string
                  > )
                  > PARTITIONED BY (ym int, day int , hm int)
                  > ROW FORMAT DELIMITED
                  > FIELDS TERMINATED BY '#'
                  > LINES TERMINATED BY '\n'
                  > STORED AS TEXTFILE;
OK
Time taken: 0.51 seconds
hive (yinzhengjie)>
hive (yinzhengjie)> show tables;
OK
tab_name
raw_logs
student
teacher
teacherbak
teachercopy
Time taken: 0.018 seconds, Fetched: 5 row(s)
hive (yinzhengjie)>
Create the Hive table that stores the raw logs (the full CREATE TABLE statement is shown in the session above)
hive (yinzhengjie)> load data inpath '/home/yinzhengjie/data/logs/umeng/raw-log/201808/06/2346' into table raw_logs partition(ym=201808 , day=06 ,hm=2346);
Loading data to table yinzhengjie.raw_logs partition (ym=201808, day=6, hm=2346)
OK
Time taken: 1.846 seconds
hive (yinzhengjie)>
Load the data on HDFS into the raw Hive table (hive (yinzhengjie)> load data inpath '/home/yinzhengjie/data/logs/umeng/raw-log/201808/06/2346' into table raw_logs partition(ym=201808 , day=06 ,hm=2346);)
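Note that LOAD DATA INPATH moves (rather than copies) the files into the table's warehouse directory. The new partition can be confirmed afterwards (a sketch, assuming the default Hive warehouse location):

hive -e "use yinzhengjie; show partitions raw_logs;"
# The source directory is now empty, and the files live under the warehouse path.
hdfs dfs -ls -R /user/hive/warehouse/yinzhengjie.db/raw_logs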
hive (yinzhengjie)> select servertimems,clientip from raw_logs limit 3;
OK
servertimems    clientip
1.53362432E9    127.0.0.1
1.53362432E9    127.0.0.1
1.53362432E9    127.0.0.1
Time taken: 0.148 seconds, Fetched: 3 row(s)
hive (yinzhengjie)>
Verify that the HDFS data loaded successfully (hive (yinzhengjie)> select servertimems,clientip from raw_logs limit 3;)
[yinzhengjie@s101 download]$ cat /home/yinzhengjie/download/umeng_create_logs_ddl.sql
use yinzhengjie ;

--startuplogs
create table if not exists startuplogs(
  appChannel string,
  appId string,
  appPlatform string,
  appVersion string,
  brand string,
  carrier string,
  country string,
  createdAtMs bigint,
  deviceId string,
  deviceStyle string,
  ipAddress string,
  network string,
  osType string,
  province string,
  screenSize string,
  tenantId string
)
partitioned by (ym int, day int, hm int) stored as parquet ;

--eventlogs
create table if not exists eventlogs(
  appChannel string,
  appId string,
  appPlatform string,
  appVersion string,
  createdAtMs bigint,
  deviceId string,
  deviceStyle string,
  eventDurationSecs bigint,
  eventId string,
  osType string,
  tenantId string
)
partitioned by (ym int, day int, hm int) stored as parquet ;

--errorlogs
create table if not exists errorlogs(
  appChannel string,
  appId string,
  appPlatform string,
  appVersion string,
  createdAtMs bigint,
  deviceId string,
  deviceStyle string,
  errorBrief string,
  errorDetail string,
  osType string,
  tenantId string
)
partitioned by (ym int, day int, hm int) stored as parquet ;

--usagelogs
create table if not exists usagelogs(
  appChannel string,
  appId string,
  appPlatform string,
  appVersion string,
  createdAtMs bigint,
  deviceId string,
  deviceStyle string,
  osType string,
  singleDownloadTraffic bigint,
  singleUploadTraffic bigint,
  singleUseDurationSecs bigint,
  tenantId string
)
partitioned by (ym int, day int, hm int) stored as parquet ;

--pagelogs
create table if not exists pagelogs(
  appChannel string,
  appId string,
  appPlatform string,
  appVersion string,
  createdAtMs bigint,
  deviceId string,
  deviceStyle string,
  nextPage string,
  osType string,
  pageId string,
  pageViewCntInSession int,
  stayDurationSecs bigint,
  tenantId string,
  visitIndex int
)
partitioned by (ym int, day int, hm int) stored as parquet ;
[yinzhengjie@s101 download]$
The HQL script that creates the per-category log tables ([yinzhengjie@s101 download]$ cat /home/yinzhengjie/download/umeng_create_logs_ddl.sql)
[yinzhengjie@s101 ~]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hbase-1.2.6/lib/phoenix-4.10.0-HBase-1.2-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Logging initialized using configuration in file:/soft/apache-hive-2.1.1-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive (default)> show databases;
OK
database_name
default
yinzhengjie
Time taken: 1.159 seconds, Fetched: 2 row(s)
hive (default)> use yinzhengjie;
OK
Time taken: 0.055 seconds
hive (yinzhengjie)>
hive (yinzhengjie)> show tables;
OK
tab_name
myusers
raw_logs
student
teacher
teacherbak
teachercopy
Time taken: 0.044 seconds, Fetched: 6 row(s)
hive (yinzhengjie)>
hive (yinzhengjie)> source /home/yinzhengjie/download/umeng_create_logs_ddl.sql;        #Run the HQL statements saved in a file on Linux; the file must exist and contain only HQL
OK
Time taken: 0.008 seconds
OK
Time taken: 0.257 seconds
OK
Time taken: 0.058 seconds
OK
Time taken: 0.073 seconds
OK
Time taken: 0.065 seconds
OK
Time taken: 0.053 seconds
hive (yinzhengjie)> show tables;
OK
tab_name
errorlogs
eventlogs
myusers
pagelogs
raw_logs
startuplogs
student
teacher
teacherbak
teachercopy
usagelogs
Time taken: 0.014 seconds, Fetched: 11 row(s)
hive (yinzhengjie)>
Run an HQL script file from within Hive (hive (yinzhengjie)> source /home/yinzhengjie/download/umeng_create_logs_ddl.sql;)
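To double-check one of the freshly created Parquet tables, describe it from the shell:

hive -e "use yinzhengjie; describe formatted startuplogs;"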

 


Reposted from: https://www.cnblogs.com/yinzhengjie/p/9434359.html
