WuKong is a full-text search engine. Features:
* [Efficient indexing and search](https://github.com/huichen/wukong/blob/master/docs/benchmarking.md) (1M Weibo posts, 500 MB of data, indexed in 28 seconds; 1.65 ms search response time; 19K search QPS)
* Chinese word segmentation (concurrent segmentation with the [sego segmentation package](https://github.com/huichen/sego), at 27 MB/s)
* Computes the [token proximity](https://github.com/huichen/wukong/blob/master/docs/token_proximity.md) of keywords within the text
* Computes [BM25 relevance](https://github.com/huichen/wukong/blob/master/docs/bm25.md)
* Supports [custom scoring fields and scoring rules](https://github.com/huichen/wukong/blob/master/docs/custom_scoring_criteria.md)
* Supports [adding and deleting index entries online](https://github.com/huichen/wukong/blob/master/docs/realtime_indexing.md)
* Supports [persistent storage](https://github.com/huichen/wukong/blob/master/docs/persistent_storage.md)
* Can be used to build [distributed indexing and search](https://github.com/huichen/wukong/blob/master/docs/distributed_indexing_and_search.md)
* Released under the business-friendly [Apache License v2](https://github.com/huichen/wukong/blob/master/license.txt)
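
To give a feel for the BM25 relevance feature listed above, here is a minimal, self-contained sketch of the textbook BM25 term score. It is not WuKong's implementation: the function names `idf` and `bm25Term`, the parameter values `k1 = 1.2` and `b = 0.75`, the "plus one" IDF variant, and the document lengths in `main` are all assumptions chosen for illustration. The linked BM25 document describes what the engine actually does.

```go
package main

import (
	"fmt"
	"math"
)

// Conventional BM25 defaults; assumed here, not taken from WuKong.
const (
	k1 = 1.2
	b  = 0.75
)

// idf returns the inverse document frequency of a term that occurs in
// docFreq out of totalDocs documents. The "1 +" inside the logarithm
// keeps the value positive even for very common terms.
func idf(totalDocs, docFreq int) float64 {
	n, df := float64(totalDocs), float64(docFreq)
	return math.Log(1 + (n-df+0.5)/(df+0.5))
}

// bm25Term scores a single query term against a single document.
// tf is the term frequency in the document, docLen the document length
// in tokens, and avgDocLen the average document length in the corpus.
func bm25Term(tf, docLen int, avgDocLen float64, totalDocs, docFreq int) float64 {
	f := float64(tf)
	norm := k1 * (1 - b + b*float64(docLen)/avgDocLen)
	return idf(totalDocs, docFreq) * f * (k1 + 1) / (f + norm)
}

func main() {
	// Hypothetical corpus: 3 documents, the term appears in 2 of them.
	const totalDocs, docFreq = 3, 2
	avgDocLen := 8.0

	fmt.Printf("short doc: %.4f\n", bm25Term(1, 6, avgDocLen, totalDocs, docFreq))
	fmt.Printf("long doc:  %.4f\n", bm25Term(1, 12, avgDocLen, totalDocs, docFreq))
}
```

With equal term frequency, the shorter document scores higher because the length normalization in `norm` penalizes documents longer than average. A full query score would sum `bm25Term` over all query terms.
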

Example code:
<pre class="brush:ruby;toolbar: true; auto-links: false;">package main

import (
	"log"

	"github.com/huichen/wukong/engine"
	"github.com/huichen/wukong/types"
)

var (
	// searcher is goroutine-safe
	searcher = engine.Engine{}
)

func main() {
	// Initialize the engine
	searcher.Init(types.EngineInitOptions{
		SegmenterDictionaries: "github.com/huichen/wukong/data/dictionary.txt"})
	defer searcher.Close()

	// Add documents to the index
	searcher.IndexDocument(0, types.DocumentIndexData{Content: "此次百度收购将成中国互联网最大并购"})
	searcher.IndexDocument(1, types.DocumentIndexData{Content: "百度宣布拟全资收购91无线业务"})
	searcher.IndexDocument(2, types.DocumentIndexData{Content: "百度是中国最大的搜索引擎"})

	// Wait for the index to finish flushing
	searcher.FlushIndex()

	// For the output format, see the types.SearchResponse struct
	log.Print(searcher.Search(types.SearchRequest{Text: "百度中国"}))
}</pre>
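
As a rough mental model of what the example above does, the following sketch builds a toy in-memory inverted index and looks up the documents that contain every query token. It is a simplification under stated assumptions, not WuKong's internals: the `invertedIndex` type and its `add` and `search` methods are hypothetical, the tokens are segmented by hand (WuKong uses sego for this), the query is assumed to require all tokens, and no ranking is performed.

```go
package main

import (
	"fmt"
	"sort"
)

// invertedIndex maps each token to the set of document IDs containing it.
type invertedIndex map[string]map[uint64]bool

// add records that document docID contains the given tokens.
func (idx invertedIndex) add(docID uint64, tokens []string) {
	for _, t := range tokens {
		if idx[t] == nil {
			idx[t] = make(map[uint64]bool)
		}
		idx[t][docID] = true
	}
}

// search returns, in ascending order, the IDs of the documents that
// contain every token of the query.
func (idx invertedIndex) search(query []string) []uint64 {
	if len(query) == 0 {
		return nil
	}
	var hits []uint64
	for docID := range idx[query[0]] {
		all := true
		for _, t := range query[1:] {
			if !idx[t][docID] {
				all = false
				break
			}
		}
		if all {
			hits = append(hits, docID)
		}
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i] < hits[j] })
	return hits
}

func main() {
	idx := invertedIndex{}
	// Hand-segmented versions of the three documents indexed above.
	idx.add(0, []string{"此次", "百度", "收购", "将", "成", "中国", "互联网", "最大", "并购"})
	idx.add(1, []string{"百度", "宣布", "拟", "全资", "收购", "91", "无线", "业务"})
	idx.add(2, []string{"百度", "是", "中国", "最大", "的", "搜索引擎"})

	// The query "百度中国" segmented into two tokens.
	fmt.Println(idx.search([]string{"百度", "中国"})) // prints [0 2]
}
```

Document 1 is not returned because it mentions 百度 but not 中国. A real engine would additionally rank the matching documents, for example by BM25 or token proximity.
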