java - Regular expression for data extraction in Java

Question

I want to extract data from a string. I am using the Pattern and Matcher classes for this, but I am struggling to write a regular expression for the following string.

"<WebApicall id="4" time="2013-10-05; 22:44:18" timeStamp="|18|44|22|5|9|113|6|277|0|" tick="11589293" file="self" bdlLine="61" type="url" url="http://www.google.com/"> WebUrl </WebApicall>"

From the string above I need the values 4, 2013-10-05; 22:44:18, and so on. How can I write the regular expression? Any help is much appreciated.

score 0 · Accepted Answer

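If you are trying to get everything inside quotation marks, you can use something like this: "([^"]+)"

This regex has some pitfalls, but unless you specify your needs more precisely, it should do the job.

Demo: http://regex101.com/r/qJ6jY8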
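As a rough sketch of how that pattern could be applied from Java, the snippet below collects every quoted value in order. The class name `QuotedValues` and the method `extract` are made up for illustration and are not from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedValues {
    // The pattern from the answer: one or more non-quote characters between double quotes.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    // Hypothetical helper: returns every quoted value in the input, in order of appearance.
    static List<String> extract(String input) {
        List<String> values = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String line = "<WebApicall id=\"4\" time=\"2013-10-05; 22:44:18\" "
                + "timeStamp=\"|18|44|22|5|9|113|6|277|0|\" tick=\"11589293\" file=\"self\" "
                + "bdlLine=\"61\" type=\"url\" url=\"http://www.google.com/\"> WebUrl </WebApicall>";
        List<String> values = extract(line);
        System.out.println(values.get(0)); // 4
        System.out.println(values.get(1)); // 2013-10-05; 22:44:18
    }
}
```

One pitfall worth knowing: because of the `+`, an empty value such as `a=""` is not matched, and it can throw off the pairing of the quotes that follow it. The values also come back by position only, so you have to know the attribute order.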

score 0 · Accepted Answer

You should use jsoup to parse html/xml. It lets you use selectors, so you can get exactly what you need. If you have to use a regular expression, use a Matcher.

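import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// [^\"]* stops at the closing quote, so each group holds exactly one attribute value.
Matcher m = Pattern.compile("id=\"([^\"]*)\"\\s+time=\"([^\"]*)\"").matcher(myXmlString);

List<String> matches = new ArrayList<String>();
while (m.find()) {
    matches.add(m.group(1)); // id
    matches.add(m.group(2)); // time
}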
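jsoup is a third-party dependency, so as a dependency-free sketch of the same "parse it, don't regex it" idea, here is the JDK's built-in DOM parser instead. This is a swap for illustration, not jsoup's API, and it assumes the input is a single well-formed XML element like the one in the question. The class name `ParseWebApicall` and the method `parseRoot` are made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class ParseWebApicall {
    // Hypothetical helper: parses a well-formed XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers get one unchecked failure type.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<WebApicall id=\"4\" time=\"2013-10-05; 22:44:18\" "
                + "timeStamp=\"|18|44|22|5|9|113|6|277|0|\" tick=\"11589293\" file=\"self\" "
                + "bdlLine=\"61\" type=\"url\" url=\"http://www.google.com/\"> WebUrl </WebApicall>";
        Element call = parseRoot(xml);
        System.out.println(call.getAttribute("id"));      // 4
        System.out.println(call.getAttribute("time"));    // 2013-10-05; 22:44:18
        System.out.println(call.getTextContent().trim()); // WebUrl
    }
}
```

Attributes are looked up by name here, so their order in the tag does not matter.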

score 0 · Accepted Answer

Here is the regular expression:

^<WebApicall\s+id="(\d+)"\s+time="([^"]*)"\s+timeStamp="((?:\|?\d+\|)+)"\s+tick="(\d+)".*url="([^"]*)">

And here is a Java snippet showing how it can be used:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

 ...

String id;
String time;
String timeStamp;
String tick;
String url;

 ...

// Inner double quotes must be escaped inside a Java string literal.
String textual = "<WebApicall id=\"4\" time=\"2013-10-05; 22:44:18\" timeStamp=\"|18|44|22|5|9|113|6|277|0|\" tick=\"11589293\" file=\"self\" bdlLine=\"61\" type=\"url\" url=\"http://www.google.com/\"> WebUrl </WebApicall>";
// Group 3 wraps the repeated (?:...)+ so it captures the whole timeStamp value, not just its last piece.
String regex = "^<WebApicall\\s+id=\"(\\d+)\"\\s+time=\"([^\"]*)\"\\s+timeStamp=\"((?:\\|?\\d+\\|)+)\"\\s+tick=\"(\\d+)\".*url=\"([^\"]*)\">";
Matcher m = Pattern.compile(regex).matcher(textual);
// find() rather than matches(): the pattern ends at the opening tag's '>',
// whereas matches() would require it to cover the entire string.
if (m.find()) {
  id = m.group(1);
  time = m.group(2);
  timeStamp = m.group(3);
  tick = m.group(4);
  url = m.group(5);
   ...
}
 ...
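A single pattern like the one above depends on the attributes appearing in a fixed order. If that is not guaranteed, one alternative is to look up each attribute separately. The following is only a sketch under that assumption: the class `AttributeLookup` and the helper `attr` are made up for illustration, and the helper assumes double-quoted values preceded by whitespace.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttributeLookup {
    // Hypothetical helper: returns the value of the attribute `name` in `tag`, if present.
    // Pattern.quote keeps any regex metacharacters in the name literal.
    static Optional<String> attr(String tag, String name) {
        Pattern p = Pattern.compile("\\s" + Pattern.quote(name) + "=\"([^\"]*)\"");
        Matcher m = p.matcher(tag);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String textual = "<WebApicall id=\"4\" time=\"2013-10-05; 22:44:18\" "
                + "timeStamp=\"|18|44|22|5|9|113|6|277|0|\" tick=\"11589293\" file=\"self\" "
                + "bdlLine=\"61\" type=\"url\" url=\"http://www.google.com/\"> WebUrl </WebApicall>";
        System.out.println(attr(textual, "id").orElse(""));   // 4
        System.out.println(attr(textual, "time").orElse("")); // 2013-10-05; 22:44:18
        System.out.println(attr(textual, "url").orElse(""));  // http://www.google.com/
    }
}
```

Compiling a pattern per lookup is fine for a handful of attributes. For large inputs you would probably want to cache the compiled patterns or switch to a real parser.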
