I need to find a good mechanism for extracting a specific word (supplied by the user) together with the seven words on either side of it. For example, given the following text:

text = "The mean distance of the Sun from the Earth is approximately 149.6 million kilometers (1 AU), though the distance varies as the Earth moves from perihelion in January to aphelion in July"

If the user enters the word "Earth", I should be able to extract the following part of the text:

mean distance of the Sun from the Earth is approximately 149.6 million kilometers (1 AU)

As you can see, the word "Earth" is surrounded by seven words on each side. How can I do this in Java?

2 Answers

Use ([^ ]+ ?) to match a single word, and ([^ ]+ ?){0,7} to capture up to seven words around the keyword.

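import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "The mean distance of the Sun from the Earth is approximately 149.6 million kilometers (1 AU), though the distance varies as the Earth moves from perihelion in January to aphelion in July";
String word = "Earth";
int around = 7;
// Up to `around` words before the keyword, the keyword itself, then up to `around` words after it.
// Pattern.quote keeps any regex metacharacters in the user-supplied keyword literal.
String pattern = "([^ ]+ ?){0," + around + "}" + Pattern.quote(word) + "( ?[^ ]+){0," + around + "}";
Matcher m = Pattern.compile(pattern).matcher(text);
if (m.find()) {
    // Prints the first occurrence together with its surrounding words.
    System.out.println(m.group());
}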
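
As a variation on the regex above, here is a minimal sketch that wraps the idea in a reusable method. The class name ContextRegex and the method extract are hypothetical, introduced only for illustration. It assumes words are separated by whitespace and that the keyword starts and ends with a letter or digit, so that \b can stop "Earth" from matching inside "Earthquake".

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContextRegex {
    // Hypothetical helper (not from the original answer): returns the first
    // occurrence of `word` with up to `around` whitespace-separated words on
    // each side, or an empty Optional if the word does not occur.
    static Optional<String> extract(String text, String word, int around) {
        String regex = "(?:\\S+\\s+){0," + around + "}"            // preceding words
                + "\\S*\\b" + Pattern.quote(word) + "\\b\\S*"      // keyword, allowing attached punctuation
                + "(?:\\s+\\S+){0," + around + "}";                // following words
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String text = "The mean distance of the Sun from the Earth is approximately 149.6 million kilometers (1 AU), though the distance varies as the Earth moves from perihelion in January to aphelion in July";
        System.out.println(extract(text, "Earth", 7).orElse("(not found)"));
    }
}
```

Returning an Optional lets the caller decide what to do when the keyword is missing, instead of silently printing nothing.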
answered 2012-10-01T00:17:22.237
public static void print() throws Exception {

    String s = "The mean distance of the Sun from the Earth is approximately 149.6 million kilometers (1 AU), though the distance varies as the Earth moves from perihelion in January to aphelion in July";
    int presize = 7;   // number of words to keep before the term
    int postsize = 7;  // number of words to keep after the term

    String term = "Earth";
    // Split on runs of whitespace to get the individual words.
    String[] flds = s.split("\\s+");

    // Find the index of the first word that equals the term exactly.
    int idx = 0;
    while (idx < flds.length && !flds[idx].equals(term)) {
        idx++;
    }

    if (idx == flds.length)
        throw new Exception("Term not found");

    // Clamp the window to the bounds of the array.
    int start = Math.max(0, idx - presize);
    int end = Math.min(flds.length - 1, idx + postsize);
    for (int i = start; i <= end; i++) {
        System.out.print(flds[i] + " ");
    }
    System.out.println();
}
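
The split-based approach above compares each token with equals, so a token such as "Earth," with a trailing comma would not match. Here is a rough sketch of one way to handle that while returning the result as a string instead of printing it. The class name ContextWindow and the method window are hypothetical, and stripping leading and trailing punctuation before comparing is an assumption about what should count as a match.

```java
import java.util.Arrays;

class ContextWindow {
    // Hypothetical helper (not from the original answer): returns the first
    // occurrence of `term` plus up to `before` words preceding it and `after`
    // words following it, or null if the term is not found.
    static String window(String text, String term, int before, int after) {
        String[] words = text.trim().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            // Ignore leading/trailing punctuation so that "Earth," matches "Earth".
            String bare = words[i].replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
            if (bare.equals(term)) {
                int start = Math.max(0, i - before);
                int end = Math.min(words.length, i + after + 1); // exclusive upper bound
                return String.join(" ", Arrays.copyOfRange(words, start, end));
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "The mean distance of the Sun from the Earth is approximately 149.6 million kilometers (1 AU), though the distance varies as the Earth moves from perihelion in January to aphelion in July";
        System.out.println(window(text, "Earth", 7, 7));
    }
}
```

The words in the returned string keep their original punctuation; only the comparison uses the stripped form.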
answered 2012-10-01T00:13:39.303