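Elasticsearch provides a river module for pulling data from other data sources. The feature is delivered as plugins, and the river plugins currently available include: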
River plugins
1. Supported by Elasticsearch
CouchDB River Plugin
RabbitMQ River Plugin
Twitter River Plugin
Wikipedia River Plugin
2. Supported by the community
ActiveMQ River Plugin (by Dominik Dorn)
Amazon SQS River Plugin (by Alex Bogdanovski)
CSV River Plugin (by Martin Bednar)
Dropbox River Plugin (by David Pilato)
FileSystem River Plugin (by David Pilato)
Git River Plugin (by Olivier Bazoud)
GitHub River Plugin (by uberVU)
Hazelcast River Plugin (by Steve Samuel)
JDBC River Plugin (by Jörg Prante)
JMS River Plugin (by Steve Sarandos)
Kafka River Plugin (by Endgame Inc.)
LDAP River Plugin (by Tanguy Leroux)
MongoDB River Plugin (by Richard Louapre)
Neo4j River Plugin (by Steve Samuel)
Open Archives Initiative (OAI) River Plugin (by Jörg Prante)
Redis River Plugin (by Steve Samuel)
RSS River Plugin (by David Pilato)
Sofa River Plugin (by adamlofts)
Solr River Plugin (by Luca Cavanna)
St9 River Plugin (by Sunny Gleason)
Subversion River Plugin (by Pascal Lombard)
DynamoDB River Plugin (by Kevin Wang)
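As the list shows, most common data sources are already covered. For relational databases in particular, a single plugin, jdbc-river, provides a uniform way to work with the data.
The source code for elasticsearch-river-jdbc is at github.com/jprante/elasticsearch-river-jdbc, and the project is documented in detail. The following uses SQL Server as an example to briefly show how it is used.
First, install elasticsearch-river-jdbc by running the following in the Elasticsearch directory: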
./bin/plugin --install jdbc --url http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-river-jdbc/1.1.0.1/elasticsearch-river-jdbc-1.1.0.1-plugin.zip
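Next, install the JDBC library for SQL Server (download: Microsoft JDBC Driver) and copy the 'sqljdbc4.jar' file it contains into the lib folder of the Elasticsearch installation directory.
In an Elasticsearch cluster, both of the steps above must be carried out on every node.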
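Since both steps have to be repeated on every node, a small local check can catch a node that was missed. The following is only a sketch in Python. It assumes the plugin was installed under the name jdbc (so it should sit in plugins/jdbc) and that the driver was copied to lib/sqljdbc4.jar. The function name and the installation path in the usage comment are hypothetical.

```python
from pathlib import Path


def missing_jdbc_river_files(es_home):
    """Return the expected paths that do not exist under es_home.

    Assumptions (not verified against a real installation):
    - the plugin was installed with the name "jdbc", so it lives in
      plugins/jdbc
    - the SQL Server driver was copied to lib/sqljdbc4.jar
    """
    home = Path(es_home)
    expected = [
        home / "plugins" / "jdbc",
        home / "lib" / "sqljdbc4.jar",
    ]
    return [str(path) for path in expected if not path.exists()]


# Usage on one node (hypothetical install location):
#   for path in missing_jdbc_river_files("/usr/share/elasticsearch"):
#       print("missing:", path)
```

An empty list means both pieces are in place on that node. The check has to be run once per node.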
The last and most important step is to create the river in Elasticsearch, so that Elasticsearch automatically pulls data from SQL Server:
PUT /_river/mytest_river/_meta
{
    "type" : "jdbc",
    "jdbc" : {
        "driver" : "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "url" : "jdbc:sqlserver://MYSQLSERVERNAME;databaseName=MYProductDatabase",
        "user" : "admin",
        "password" : "Password",
        "sql" : "select ProductID as _id, CategoryID,ManufacturerID,MfName,ProductTitle,MfgPartNumber from MyProductsTable(nolock)",
        "poll" : "10m",
        "strategy" : "simple",
        "index" : "myinventory",
        "type" : "product",
        "bulk_size" : 100,
        "max_retries" : 5,
        "max_retries_wait" : "30s",
        "max_bulk_requests" : 5,
        "bulk_flush_interval" : "5s"
    }
}
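The same river definition can also be sent from a script rather than a REST console. The sketch below uses only the Python standard library. The helper name build_river_request and the address http://localhost:9200 are assumptions made for illustration; the endpoint and body are the ones from the request above.

```python
import json
import urllib.request


def build_river_request(es_url, river_name, jdbc_settings):
    """Build (without sending) the PUT request that defines a JDBC river.

    es_url is the base address of an Elasticsearch node, and
    jdbc_settings is the dict that goes under the "jdbc" key.
    """
    body = {"type": "jdbc", "jdbc": jdbc_settings}
    return urllib.request.Request(
        url="{}/_river/{}/_meta".format(es_url.rstrip("/"), river_name),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Usage (assumes a node is reachable at localhost:9200):
#   settings = {
#       "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
#       "url": "jdbc:sqlserver://MYSQLSERVERNAME;databaseName=MYProductDatabase",
#       "user": "admin",
#       "password": "Password",
#       "sql": "select ProductID as _id, MfName from MyProductsTable(nolock)",
#       "index": "myinventory",
#       "type": "product",
#   }
#   request = build_river_request("http://localhost:9200", "mytest_river", settings)
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it possible to inspect the URL and JSON body before anything reaches the cluster.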