ruby - How can I find acronyms in a text?

Question

My project needs to read many files (each file has a title, text, and sections) and find the titles of the files that contain an acronym. This is my document class:

class Doc
  def initialize(id, secciones)
    @id, @secciones = id, secciones
  end
  def to_s
    result = "" + @id.to_s + "\n" + @secciones.to_s
    return result
  end
  def tiene_acronimo(acr)
    puts "a ver si tiene acronimos el docu.."
    tiene_acronimo = false
    secciones.each do |seccion|
      if seccion.tiene_acronimo(acr)
        tiene_acronimo = true
      end
    end
    return tiene_acronimo
  end
  attr_accessor :id
  attr_accessor :secciones
end

And this is my section class:

class Section
  # Accessors are needed because tiene_acronimo calls title and text
  attr_accessor :title, :text

  def initialize
    @title = ""
    @text = ""
  end

  def tiene_acronimo(acr)
    return title.include?(acr) || text.include?(acr)
  end
end

And this is my method:

def test()
  results = Array.new
  puts "Dame el acronimo"
  acr = gets.chomp  # drop the trailing newline that gets keeps
  documentos_cientificos.each do |d|
    if d.tiene_acronimo(acr)
      results << d
    end
  end
  results  # return the matching documents
end

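This method reads an acronym and should find all the documents that contain it. The `include?` method ignores capitalization and returns `true` whenever the document contains a substring that looks like the acronym. For example: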

Multiple sclerosis (**MS**), also known as # => `true`
Presenting signs and sympto**ms** # => `false` (but `include?` returns `true`)

How can I find acronyms more easily?

score 1 · Accepted Answer

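You can use a regular expression with the `match` method. The following regex finds a match only when the supplied string appears in the content as a whole word. Substrings inside longer words are ignored, and the match is case-sensitive.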

acr = "MS"
title = "Multiple sclerosis (MS), also known as"
text = "Presenting signs and symptoms"

title.match(/\b#{Regexp.escape(acr)}\b/) # => #<MatchData "MS">
text.match(/\b#{Regexp.escape(acr)}\b/) # => nil

Or, equivalently, as booleans:

title.match(/\b#{Regexp.escape(acr)}\b/).to_a.size > 0 # => true
text.match(/\b#{Regexp.escape(acr)}\b/).to_a.size > 0 # => false
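The two ingredients of that regex can be checked in isolation: `\b` restricts the match to word boundaries, and `Regexp.escape` makes any regex metacharacters in the acronym (such as `.`) match literally. The sketch below wraps the check in a helper named `whole_word?`, which is a hypothetical name and not part of the answer above, and uses `String#match?` (available since Ruby 2.4) to get a boolean directly. Note that `String#include?` is itself case-sensitive, so the gain here comes from the word boundaries.

```ruby
# Hypothetical helper for illustration: true when acr occurs in content
# as a whole word (case-sensitive).
def whole_word?(content, acr)
  content.match?(/\b#{Regexp.escape(acr)}\b/)
end

title = "Multiple sclerosis (MS), also known as"
text  = "Presenting signs and symptoms"

puts whole_word?(title, "MS")  # prints true
puts whole_word?(text, "ms")   # prints false: "ms" is only the tail of "symptoms"

# Regexp.escape keeps the dots literal instead of "any character"
puts whole_word?("made in the U.S.A today", "U.S.A")  # prints true
puts whole_word?("code UXSXA here", "U.S.A")          # prints false
```

One limitation to keep in mind: `\b` only matches between a word character and a non-word character, so an acronym that ends in punctuation (for example a trailing dot) may not match at the end with this pattern.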

...so you can redefine your function as follows:

def tiene_acronimo(acr)
  regex_to_match = /\b#{Regexp.escape(acr)}\b/
  has_acr = false
  if (title.match(regex_to_match)) || (text.match(regex_to_match))
    has_acr = true
  end

  return has_acr
end
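For completeness, here is a minimal end-to-end sketch of how the redefined method could fit into the two classes from the question. The constructor arguments on `Section`, the `attr_reader` declarations, and the sample documents are assumptions added for illustration; the question's `Section#initialize` takes no arguments. It also swaps the explicit flag variables for `any?` and `select`, which is a stylistic choice rather than something the answer requires.

```ruby
# Sketch only: Section's constructor arguments and the sample data
# below are assumptions, not taken from the question.
class Section
  attr_reader :title, :text

  def initialize(title = "", text = "")
    @title = title
    @text = text
  end

  # True when acr appears as a whole word in the title or the text.
  def tiene_acronimo(acr)
    regex = /\b#{Regexp.escape(acr)}\b/
    title.match?(regex) || text.match?(regex)
  end
end

class Doc
  attr_reader :id, :secciones

  def initialize(id, secciones)
    @id = id
    @secciones = secciones
  end

  # True when at least one section contains the acronym.
  def tiene_acronimo(acr)
    secciones.any? { |seccion| seccion.tiene_acronimo(acr) }
  end
end

docs = [
  Doc.new(1, [Section.new("Multiple sclerosis (MS), also known as", "")]),
  Doc.new(2, [Section.new("Overview", "Presenting signs and symptoms")])
]

results = docs.select { |d| d.tiene_acronimo("MS") }
puts results.map(&:id).inspect  # prints [1]
```

Only the first document is selected, because "MS" appears there as a standalone word, while the second document contains the letters only inside "symptoms".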
