Ruby on Rails Chinese Wiki
acts_as_ferret

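Implementing full-text search with acts_as_ferret

About acts_as_ferret

acts_as_ferret is a Ruby on Rails plugin that makes it easy to add full-text search. It is built on Ferret, a Ruby port of the Apache Lucene full-text search engine. acts_as_ferret suits almost any application that needs full-text search.

The following acts_as_ferret usage example was contributed by tim.teng.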

Download the ferret gem and the acts_as_ferret plugin

gem : gem install ferret
plugin : svn://projects.jkraemer.net/acts_as_ferret/trunk/plugin/acts_as_ferret (check it out directly into the vendor/plugins directory)

Using acts_as_ferret

Suppose we have a model:

class Provider < ActiveRecord::Base
  # ...
  acts_as_ferret :fields => [:name, :description, :address, :website, :score, :phone, :direction]
  # ...
end

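Here :fields lists the model attributes to index. You can also leave it out and write just "acts_as_ferret", in which case all columns are indexed by default. Let's try it out first: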

$ ruby script/console
>> Provider.find_by_contents("what you want to search for")


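OK, if nothing went wrong, the call returns an ActsAsFerret::SearchResults object. If you get an error, go back and check your setup; with so few steps, that should be quick.

A few tips:

  1. "keyword1" AND "keyword2": Provider.find_by_contents("keyword1 keyword2")
  2. "keyword1" OR "keyword2": Provider.find_by_contents("keyword1 OR keyword2")
  3. Results similar to "keyword1" (fuzzy match): Provider.find_by_contents("keyword1~")

These are all just the basics.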
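The three query forms in the tips can be assembled by a small helper. This is only a minimal sketch: build_query and its :operator and :fuzzy options are hypothetical names, not part of acts_as_ferret. It does nothing more than build the query string you would then pass to find_by_contents.

```ruby
# Hypothetical helper (not part of acts_as_ferret): builds a query string
# from a list of keywords, following the three tips above.
def build_query(keywords, options = {})
  operator = options[:operator] || :and
  # Drop empty keywords so the query has no stray operators.
  terms = keywords.map { |k| k.to_s.strip }.reject { |k| k.empty? }
  # Appending "~" to a term asks for a fuzzy match (tip 3).
  terms = terms.map { |t| "#{t}~" } if options[:fuzzy]
  # Space-separated terms are ANDed (tip 1); OR is written out (tip 2).
  separator = operator == :or ? " OR " : " "
  terms.join(separator)
end

build_query(["keyword1", "keyword2"])                    # "keyword1 keyword2"
build_query(["keyword1", "keyword2"], :operator => :or)  # "keyword1 OR keyword2"
build_query(["keyword1"], :fuzzy => true)                # "keyword1~"
```

Usage would then look like Provider.find_by_contents(build_query(["keyword1", "keyword2"], :operator => :or)).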

Now let's look inside the acts_as_ferret plugin.
In acts_as_ferret/lib, look at line 116 of acts_as_ferret.rb (VERSION=181):
"ActiveRecord::Base.extend ActsAsFerret::ActMethods"
Many blogs suggest adding the following code to every model that needs full-text search:

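# Class-level search helper: returns [total_hits, results] for query q.
def self.full_text_search(q, options = {})
  # Return an empty pair so callers can always destructure the result.
  return [0, []] if q.blank?
  default_options = {:limit => 10, :page => 1}
  options = default_options.merge options
  # Convert the 1-based page number into a result offset.
  options[:offset] = options[:limit] * (options.delete(:page).to_i - 1)
  results = find_by_contents(q, options)
  [results.total_hits, results]
end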
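
The :page to :offset conversion in this method is plain arithmetic, and it can be checked on its own without Rails or Ferret. The sketch below isolates it; page_to_offset is a hypothetical name, and the guard against page numbers below 1 is an addition that the original method does not have.

```ruby
# Hypothetical helper isolating the paging arithmetic of full_text_search:
# it turns a 1-based :page into a 0-based :offset.
def page_to_offset(options = {})
  options = { :limit => 10, :page => 1 }.merge(options)
  page = options.delete(:page).to_i
  # Added guard (not in the original): "0" or non-numeric input from
  # params[:page] would otherwise produce a negative offset.
  page = 1 if page < 1
  options[:offset] = options[:limit] * (page - 1)
  options
end

page_to_offset                              # limit 10, offset 0
page_to_offset(:page => 3)                  # limit 10, offset 20
page_to_offset(:page => "2", :limit => 25)  # limit 25, offset 25
```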

However, in the spirit of DRY, I decided to take advantage of the line "ActiveRecord::Base.extend ActsAsFerret::ActMethods". Since ActiveRecord::Base extends ActsAsFerret::ActMethods, why not add this code to ActMethods so that every model gets full_text_search?

module ActsAsFerret #:nodoc:

  # This module defines the acts_as_ferret method and is included into
  # ActiveRecord::Base
  module ActMethods

    # Returns [total_hits, results] for query q on the calling model.
    def full_text_search(q, options = {})
      # Return an empty pair so callers can always destructure the result.
      return [0, []] if q.blank?
      default_options = {:limit => 10, :page => 1}
      options = default_options.merge options
      # Convert the 1-based page number into a result offset.
      options[:offset] = options[:limit] * (options.delete(:page).to_i - 1)
      results = find_by_contents(q, options)
      [results.total_hits, results]
    end
    # ...
  end
end
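
Why this works comes down to how extend behaves: methods of an extended module become class-level methods, and subclasses see them too. The sketch below shows the mechanism in plain Ruby. FakeBase, SearchMethods and Plain are stand-ins invented for illustration (playing the roles of ActiveRecord::Base, ActsAsFerret::ActMethods and a model without an index), and find_by_contents is a stub, not the real plugin method.

```ruby
# Stand-in for ActsAsFerret::ActMethods (illustration only).
module SearchMethods
  # Simplified full_text_search: delegates to find_by_contents on
  # whichever class receives the call.
  def full_text_search(q)
    results = find_by_contents(q)
    [results.size, results]
  end
end

# Stand-in for ActiveRecord::Base.
class FakeBase
  # extend adds the module's methods as class-level methods;
  # every subclass can call them as well.
  extend SearchMethods
end

class Provider < FakeBase
  # Stub for the find_by_contents that acts_as_ferret would provide.
  def self.find_by_contents(q)
    ["result for #{q}"]
  end
end

# A model with no find_by_contents at all.
class Plain < FakeBase
end

total, providers = Provider.full_text_search("pizza")
# total is 1, providers is ["result for pizza"]

begin
  Plain.full_text_search("pizza")
rescue NoMethodError
  # Plain inherits full_text_search but has no find_by_contents to call.
  plain_failed = true
end
```

In this sketch every subclass gains full_text_search, but the call only succeeds on classes that also have find_by_contents, so the shortcut is best suited to models that actually declare an index.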

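Restart the server and Provider.full_text_search should be available; if something is wrong, go back and check your changes.
Next comes the usual routine. Add this to application.rb: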

def pages_for(size, options = {})
  default_options = {:per_page => 10}
  options = default_options.merge options
  Paginator.new self, size, options[:per_page], (params[:page] || 1)
end

All of this work is so that we can call it from the controller and the view:
controller:

def search
  @query = params[:query]
  @total, @providers = Provider.full_text_search(@query, :page => (params[:page] || 1))
  @pages = pages_for(@total)
end

view:


<%= pagination_links(@pages, :params => {:query=>@query}) %>

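As for the find_options argument in find_by_contents(q, options = {}, find_options = {}), consider the following situation.
The providers table has these columns:
id name description category_id

How do you search only the rows with a particular category_id? This is where find_options comes in: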


Provider.find_by_contents(
  "keywords to search for",
  {:limit => 30},
  {
    :joins => "providers join categories on providers.category_id = categories.id",
    :conditions => "providers.category_id = XXXXX"
  }
)
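
When the category id comes from user input, it is safer not to paste it into the SQL string. ActiveRecord's :conditions option also accepts an array of a SQL fragment with "?" placeholders followed by the values. The sketch below builds such a find_options hash; category_find_options is a hypothetical helper, and it assumes your version of the plugin passes an array-form :conditions through to ActiveRecord's find, which is worth checking before relying on it.

```ruby
# Hypothetical helper: builds the third argument (find_options) for
# find_by_contents, restricting results to a single category.
def category_find_options(category_id)
  {
    # Array form: ActiveRecord substitutes the value for "?" and quotes it.
    # Integer() raises ArgumentError for anything that is not a whole number.
    :conditions => ["providers.category_id = ?", Integer(category_id)]
  }
end

find_options = category_find_options("42")
# Assumed usage, with the model from this page:
# Provider.find_by_contents("keywords to search for", {:limit => 30}, find_options)
```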

You can find other small tricks online; I won't list them all here.


:fields can also be given as a hash with per-field options, for example:

acts_as_ferret :fields => {
  :name => {},
  :address => {},
  :website => {},
  :score => {:index => :untokenized, :term_vector => :no, :boost => 10},
  :vote_count => {:index => :untokenized, :term_vector => :no, :boost => 5},
  :category_name => {},
  :tag_name => {},
  :taste => {:index => :untokenized},
  :service => {:index => :untokenized},
  :environment => {:index => :untokenized},
  :deal => {:index => :untokenized},
}