
jdbc - Logstash not reading in new entries from MySQL

I have Logstash and Elasticsearch installed locally on my Windows 7 machine, and I have installed the logstash-input-jdbc plugin in Logstash.

I have data in a MySQL database which I send to Elasticsearch using Logstash so I can do some report generation.

Here is the Logstash config file that does this:

input {
  jdbc {
    jdbc_driver_library => "C:/logstash/lib/mysql-connector-java-5.1.37-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/test"
    jdbc_user => "root"
    jdbc_password => ""
    statement => "SELECT * FROM transport.audit"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # note: MM is the month in the date format; lowercase mm would be minutes
    index => "transport-audit-%{+YYYY.MM.dd}"
  }
}

This works, and Logstash sends the data to Elasticsearch when I run:

bin\logstash agent -f logstash\conf\1_input.conf

This is the output from that command:

io/console not supported; tty will not be manipulated
Default settings used: Filter workers: 4
Logstash startup completed
Logstash shutdown completed

Why does Logstash shut down?

When I check Elasticsearch, the data is there, and if I run the command again the data is re-indexed (duplicated):

[screenshot: the audit data indexed in Elasticsearch]

Here is the MySQL data:

[screenshot: the transport.audit table in MySQL]

What I am trying to achieve:

I want Logstash to keep running and listen for new entries in the audit table, and index only that new data (when a new audit entry is inserted into the table, Logstash would notice and send just that entry to Elasticsearch).

Also, why does Logstash stop when I run that command; shouldn't it stay running? I am new to Logstash and Elasticsearch.

Thanks

G

I have also posted the same question on the Elastic forum; if I get an answer there, I will post it here to help others.


1 Reply


By default, the logstash-input-jdbc plugin will run your SELECT statement once and then quit. You can change this behavior by adding a schedule parameter with a cron expression to your configuration, like this:

input {
  jdbc {
    jdbc_driver_library => "C:/logstash/lib/mysql-connector-java-5.1.37-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/test"
    jdbc_user => "root"
    jdbc_password => ""
    statement => "SELECT * FROM transport.audit"
    schedule => "* * * * *"               # <----- add this line
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
  }
}

The result is that the SELECT statement will now run every minute.
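
If once a minute is too frequent, the schedule option accepts a cron-like syntax (the plugin parses it with the rufus-scheduler library), so you can poll at whatever interval suits you. A few illustrative values (examples, not from the question):

   schedule => "*/5 * * * *"    # every 5 minutes
   schedule => "0 * * * *"      # every hour, on the hour
   schedule => "0 6 * * *"      # every day at 06:00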

If you had a date field in your MySQL table (though that doesn't seem to be the case here), you could also use the pre-defined sql_last_start parameter so that you don't re-index all records on every run (note: in more recent versions of the plugin this parameter was renamed sql_last_value). The parameter can be used in your query like this:

   statement => "SELECT * FROM transport.audit WHERE your_date_field >= :sql_last_start"
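
Putting it all together, here is a sketch of a complete config that polls every minute, fetches only new rows, and avoids the duplication you saw. Note the assumptions: your_date_field is a placeholder for your actual timestamp column, and document_id => "%{id}" assumes the audit table has a unique id column; reusing it as the Elasticsearch document id makes re-processed rows update the existing document instead of creating a duplicate.

input {
  jdbc {
    jdbc_driver_library => "C:/logstash/lib/mysql-connector-java-5.1.37-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/test"
    jdbc_user => "root"
    jdbc_password => ""
    # only fetch rows newer than the previous run
    statement => "SELECT * FROM transport.audit WHERE your_date_field >= :sql_last_start"
    schedule => "* * * * *"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "transport-audit-%{+YYYY.MM.dd}"
    # assumption: the table has a unique "id" column; using it as the
    # document id turns re-indexed rows into updates, not duplicates
    document_id => "%{id}"
  }
}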
