Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Syncing MySQL to Elasticsearch with the JDBC importer keeps producing duplicate data

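Every time the .sh file runs in the background, the data already in ES is duplicated once more instead of being overwritten. From what I can tell it is probably a timestamp problem: the "schedule" field in the .sh file is a formatted time, while my database stores Unix timestamps.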

#!/bin/sh
# The asker suspects the "schedule" entry below is the cause of the problem.
bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
echo '
{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://localhost:3306/wenda",
        "user" : "root",
        "password" : "123456",
        "statefile" : "statefile-ask.json",
        "schedule" : "0 0-59 0-23 ? * *",
        "sql" : [
            {
                "statement" : "select * from think_ask where create_time > ?",
                "parameter" : [ "$metrics.lastexecutionstart" ]
            }
        ],
        "index" : "myindex",
        "type" : "mytype",
        "index_settings" : {
            "analysis" : {
                "analyzer" : {
                    "ik" : {
                        "tokenizer" : "ik"
                    }
                }
            }
        },
        "type_mapping": {
            "ask" : {
                "properties" : {
                    "asid" : {
                        "type" : "long"
                    },
                    "content" : {
                        "type" : "string",
                        "analyzer" : "ik",
                        "index" : "not_analyzed",
                        "searchAnalyzer": "pinyin_analyzer"
                    },
                    "uname" : {
                        "type" : "string",
                        "analyzer" : "ik"
                    },
                    "click_count" : {
                        "type" : "long"
                    },
                    "time" : {
                        "type" : "long"
                    }
                }
            }
        }
    }
}
' | java \
    -cp "${lib}/*" \
    -Dlog4j.configurationFile=${bin}/log4j2.xml \
    org.xbib.tools.Runner \
    org.xbib.tools.JDBCImporter

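The job is currently scheduled to run once a minute. Every run uploads the data again and creates duplicates, whether or not anything has changed. How should I change the time handling to fix this?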



1 Reply


When you write the SQL in "statement", you can format the time into whatever format you want.
For example:

SELECT FROM_UNIXTIME(create_time, '%Y-%m-%d %H:%i:%s') AS dateline, think_ask.* FROM think_ask WHERE create_time > ?
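
As a possible complement to the query above, here is a minimal sketch of the "sql" entry, not a confirmed fix. It assumes that create_time holds Unix seconds, that asid is the table's primary key (the question does not say so), and that the importer binds $metrics.lastexecutionstart as a date/time value. As far as I know, the "schedule" value is a Quartz cron expression that only controls when the job fires and is never compared with your data. Aliasing a unique column to _id makes the importer use it as the document ID, so a re-imported row should overwrite the existing document instead of creating a new one. The variable name sql_fragment is only for illustration.

```shell
#!/bin/sh
# Sketch only: a revised "sql" entry for the importer definition.
# Assumptions (not confirmed by the question):
#   - asid is the primary key of think_ask
#   - create_time stores Unix seconds
# Single quotes keep the shell from expanding $metrics.lastexecutionstart.
sql_fragment='
"sql" : [
    {
        "statement" : "select *, asid as _id from think_ask where create_time > UNIX_TIMESTAMP(?)",
        "parameter" : [ "$metrics.lastexecutionstart" ]
    }
]'

# Print the fragment so it can be pasted into the JSON passed to the importer.
printf '%s\n' "$sql_fragment"
```

Paste the printed fragment in place of the existing "sql" entry. UNIX_TIMESTAMP() converts using the MySQL session time zone, so check that it matches the time zone the importer uses for the last execution time.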
