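elasticsearch-river-jdbc inserts duplicate records from a MySQL database

Sorry, I'm new to Elasticsearch. I'm using elasticsearch-river-jdbc to connect to a MySQL database, and everything works fine except that every scheduled run inserts duplicate records. This is what I'm using: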

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{ "type" : "jdbc", "schedule" : "0 0-59 0-23 ? * *", "jdbc" : { "url" : "jdbc:mysql://localhost:3306/test", "user" : "test", "password" : "test", "sql" : "select * from test" } }'

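I went through some documentation which mentions that the SQL query can select based on _id. My problem is that this unique id is only created when the river is created, and it is created on the Elasticsearch side, so as far as I understand MySQL knows nothing about it. Please let me know if I'm missing something.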

So, if I write the SQL statement like this:

  "sql" : "select id as _id,a1,a2 from test"

it fails with:

  [2015-03-10 13:16:00,018][ERROR][river.jdbc.RiverPipeline ] com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'id' in 'field list'
  java.io.IOException: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'id' in 'field list'

The solution to this problem is that I need to select one of the existing fields as "_id" for it to work:

  "sql" : "select *, revision as _id from test;"

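A minimal sketch of why the alias helps, assuming each row is indexed under its _id so that a row arriving with an existing _id replaces the earlier document instead of adding a new one. The dict below only stands in for the Elasticsearch index; the function name and the sample rows are hypothetical, and this is not the river's actual code.

```python
import uuid

def index_rows(index, rows):
    """Store each row under its "_id"; rows without one get a random id."""
    for row in rows:
        doc = dict(row)
        # Use the row's own _id when present, otherwise generate a fresh one.
        doc_id = doc.pop("_id") if "_id" in doc else uuid.uuid4().hex
        index[doc_id] = doc  # same id overwrites, new id adds a document
    return index

# Hypothetical rows, standing in for the result of "select * from test".
rows = [{"revision": 1, "a1": "x"}, {"revision": 2, "a1": "y"}]

# Without _id: two scheduled runs leave four documents.
without_id = {}
index_rows(without_id, rows)
index_rows(without_id, rows)

# With "revision as _id": two runs still leave two documents.
with_id = {}
keyed = [dict(r, _id=r["revision"]) for r in rows]
index_rows(with_id, keyed)
index_rows(with_id, keyed)

print(len(without_id), len(with_id))  # prints: 4 2
```

This only holds if the column aliased to _id is unique per row; two rows sharing a revision value would collapse into a single document.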
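Now the other problem is that when the data is written to ES, the date and time format changes to UTC:

 e.g. 2015-03-11T00:00:00.000-07:00 and 1970-01-01T10:55:54.000-08:00

There is already a thread about this, but it has no workaround: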

https://stackoverflow.com/questions/12969481/jprante-elasticsearch-jdbc-river-changing-the-date-value

The solution to this problem is to set the timezone in the jdbc block:

 "timezone" : "TimeZone.getDefault()"

Also, I have the date and the time in separate fields in the MySQL database:

 | date | date | YES | | NULL | |
 | time | time | YES | | NULL | |

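Elasticsearch uses the Joda time format to store dates, so it automatically converts my date to a datetime.

Since the date field has no time component, zeros are added automatically.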
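The padding described above can be sketched as follows. This is only an illustration in Python, not the river's or Joda's code: it assumes a date-only value is completed with midnight and a time-only value with the epoch date 1970-01-01, which is consistent with the two example values shown earlier. The function name is hypothetical and the UTC offsets are taken from those examples.

```python
from datetime import date, time, datetime, timedelta, timezone

def as_timestamp(value, utc_offset_hours):
    """Expand a date-only or time-only value into a full ISO 8601 timestamp."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    if isinstance(value, date):
        # DATE column: no time part, so midnight (all zeros) is filled in.
        d, t = value, time(0, 0, 0)
    else:
        # TIME column: no date part, so the epoch date is assumed here.
        d, t = date(1970, 1, 1), value
    return datetime.combine(d, t, tzinfo=tz).isoformat(timespec="milliseconds")

print(as_timestamp(date(2015, 3, 11), -7))   # 2015-03-11T00:00:00.000-07:00
print(as_timestamp(time(10, 55, 54), -8))    # 1970-01-01T10:55:54.000-08:00
```

Seen this way, the separate date and time columns end up as two unrelated timestamps rather than one combined value, which is why they do not display as a plain date and a plain time.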

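Since I need to display the data through Kibana, I need the date and the time kept separate. As a workaround I converted the date and time columns to varchar(20) (a bad idea, I know), and it is working fine now.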