regex - Shell script - listing files, reading files, writing data to a new file

Question

I have a specific question about shell scripting.
Simple scripts are no problem for me, but this one is new to me: I want to build a simple database file.

So, here is what I want to do:

- Search for files of a given type (e.g. .nfo) <-- should be no problem :)
- Read each file found and extract some strings from it
- Write those strings to a new file, with the information from each file found on a single line

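I hope I have explained my "project" well enough.

My problem is understanding how to tell the script that it has to search for the files, then read each of them, take the information inside, and write it to a new file.

Let me explain in a bit more detail.
I search for the files, and this is what comes back: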

file1.nfo
file2.nfo
file3.nfo

Now, in each of those files I need the information between two strings, i.e. in file1.nfo:

<user>test1</user>

file2.nfo:

<user>test2</user>

So the new file should contain:

file1.nfo:test1
file2.nfo:test2

Okay:

find -name *.nfo  > /test/database.txt

prints the list of files, and

sed -n '/<user*/,/<\/user>/p' file1.nfo

returns the complete file, not just the information between <user> and </user>.

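I am trying to go step by step and I am reading a lot, but it seems very difficult.

What is the best way to list all the files and write each file name, together with the content between two strings, to a file?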

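EDIT - NEW:

Here is an update with more details. I have learned a lot by now and searched the web for my problem. I can find plenty of information, but I don't know how to put it together so that I can use it.

What I currently have working with awk is getting the file name and one string.

Here is the complete information (I thought I could carry on by myself with a little help, but I can't :( ).

Here is an example of /test/file1.nfo: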

<string1>STRING 1</string1>
<string2>STRING 2</string2>
<string3>STRING 3</string3>
<string4>STRING 4</string4>
<personal informations>
<hobby>Baseball</hobby>
<hobby>Basketball</hobby>
</personal informations>

Example of /test/file2.nfo:

<string1>STRING 1</string1>
<string2>STRING 2</string2>
<string3>STRING 3</string3>
<string4>STRING 4</string4>
<personal informations>
<hobby>Soccer</hobby>
<hobby>Traveling</hobby>
</personal informations>

The file I want to create should look like this:

STRING 1:::/test/file1.nfo:::Date of file:::STRING 4:::STRING 3:::Baseball, Basketball:::STRING 2
STRING 1:::/test/file2.nfo:::Date of file:::STRING 4:::STRING 3:::Soccer, Traveling:::STRING 2

"Date of file" should be the creation date of the file, so that I can see how old the file is.

So that is what I need, and it doesn't seem to be easy.

Thanks a lot.

UPDATE: error with -printf

find: unrecognized: -printf

Usage: find [PATH]... [OPTIONS] [ACTIONS]

Search for files and perform actions on them.
First failed action stops processing of current file.
Defaults: PATH is current directory, action is '-print'

    -follow         Follow symlinks
    -xdev           Don't descend directories on other filesystems
    -maxdepth N     Descend at most N levels. -maxdepth 0 applies
                    actions to command line arguments only
    -mindepth N     Don't act on first N levels
    -depth          Act on directory *after* traversing it

Actions:
    ( ACTIONS )     Group actions for -o / -a
    ! ACT           Invert ACT's success/failure
    ACT1 [-a] ACT2  If ACT1 fails, stop, else do ACT2
    ACT1 -o ACT2    If ACT1 succeeds, stop, else do ACT2
                    Note: -a has higher priority than -o
    -name PATTERN   Match file name (w/o directory name) to PATTERN
    -iname PATTERN  Case insensitive -name
    -path PATTERN   Match path to PATTERN
    -ipath PATTERN  Case insensitive -path
    -regex PATTERN  Match path to regex PATTERN
    -type X         File type is X (one of: f,d,l,b,c,...)
    -perm MASK      At least one mask bit (+MASK), all bits (-MASK),
                    or exactly MASK bits are set in file's mode
    -mtime DAYS     mtime is greater than (+N), less than (-N),
                    or exactly N days in the past
    -mmin MINS      mtime is greater than (+N), less than (-N),
                    or exactly N minutes in the past
    -newer FILE     mtime is more recent than FILE's
    -inum N         File has inode number N
    -user NAME/ID   File is owned by given user
    -group NAME/ID  File is owned by given group
    -size N[bck]    File size is N (c:bytes,k:kbytes,b:512 bytes(def.))
                    +/-N: file size is bigger/smaller than N
    -links N        Number of links is greater than (+N), less than (-N),
                    or exactly N
    -prune          If current file is directory, don't descend into it
If none of the following actions is specified, -print is assumed
    -print          Print file name
    -print0         Print file name, NUL terminated
    -exec CMD ARG ; Run CMD with all instances of {} replaced by
                    file name. Fails if CMD exits with nonzero
    -delete         Delete current file/directory. Turns on -depth option

score 2 · Accepted Answer

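sed's pat1,pat2 notation is line-based. Think of it this way: pat1 sets an enable flag for the command and pat2 clears it. When pat1 and pat2 are both on the same line, the flag gets set but pat2 is only looked for on the following lines, so in this case sed prints everything from the <user> line onward. See grymoire's sed how-to for details.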
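The line-based behaviour can be reproduced with a small experiment. This is only a sketch: the sample file content (apart from the <user> line) and the variable names are made up for illustration, and the substitution form is an alternative that stays within sed, not something taken from the question.

```shell
#!/bin/sh
# Sample file: the <user> element sits on one line, between other lines.
tmp=$(mktemp -d)
printf '%s\n' '<title>x</title>' '<user>test1</user>' '<year>2000</year>' > "$tmp/file1.nfo"

# Range form: the end pattern is only tested from the line AFTER the start
# line, so the range never closes and runs to the end of the file.
range_out=$(sed -n '/<user>/,/<\/user>/p' "$tmp/file1.nfo")

# Substitution form: capture the text between the tags on a single line
# and print only lines where the substitution succeeded.
subst_out=$(sed -n 's:.*<user>\(.*\)</user>.*:\1:p' "$tmp/file1.nfo")

printf 'range:\n%s\n' "$range_out"
printf 'subst: %s\n' "$subst_out"
rm -rf "$tmp"
```

The range form prints the <user> line and everything after it, while the substitution form prints only the text between the tags.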

In this case, instead of sed, you can use a grep that supports lookaround assertions, such as GNU grep:

find . -type f -name '*.nfo' | xargs grep -oP '(?<=<user>).*(?=</user>)'

If your grep does not support -P, you can combine grep and sed:

find . -type f -name '*.nfo' | xargs grep -o '<user>.*</user>' | sed 's:</\?user>::g'

Output:

./file1.nfo:test1
./file2.nfo:test2

Note that you need to be aware of the issues involved in passing file names to xargs, and may need to use -exec ... instead.
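One way the -exec variant might look is sketched below. The function name list_users and its directory parameter are hypothetical. The extra /dev/null operand is a common trick to make grep print the file name even though it is handed a single real file per invocation, and the \; form is used because it is the only -exec form listed in the find usage text in the question.

```shell
#!/bin/sh
# Sketch: same extraction as the grep + sed pipeline, but find runs grep
# itself, so file names with spaces never go through xargs' word splitting.
list_users() {
    # $1: directory to search (hypothetical parameter)
    find "$1" -type f -name '*.nfo' \
        -exec grep -o '<user>.*</user>' /dev/null {} \; |
        sed 's:</*user>::g'    # strip the opening and closing tags
}
```

Calling list_users . in a directory containing "my file.nfo" would then print a line of the form ./my file.nfo:test1.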

score 1 · Accepted Answer

All you need is:

find -name '*.nfo' | xargs awk -F'[><]' '{print FILENAME,$3}'

If your files contain more than what you show in your sample input, then this is probably all you need:

... awk -F'[><]' '/<user>/{print FILENAME,$3}' file
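To see why the value ends up in $3, it can help to print the fields awk produces when both < and > act as separators. The helper name show_fields below is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Print every field of one input line as awk sees it with -F'[><]'.
show_fields() {
    printf '%s\n' "$1" |
        awk -F'[><]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

show_fields '<user>test1</user>'
# 1=[]
# 2=[user]
# 3=[test1]
# 4=[/user]
# 5=[]
```

Field 2 is the tag name and field 3 is the text between the tags, which is also what the longer script relies on when it indexes its array by $2.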

Try this (untested):

> outfile
find -name '*.nfo' -printf "%p %Tc\n" |
while read -r fname tstamp    # default IFS, so the line splits into name and timestamp
do
      awk -v tstamp="$tstamp" -F'[><]' -v OFS=":::" '
          { a[$2] = a[$2] sep[$2] $3; sep[$2] = ", " }
          END {
              print a["string1"], FILENAME, tstamp, a["string4"], a["string3"], a["hobby"], a["string2"]
          }
      ' "$fname" >> outfile
done

The above only works if your file names do not contain spaces. If they can, you will need to tweak the loop.
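One possible tweak, sketched here under the assumption that file names may contain spaces but not tabs or newlines, is to have find separate the name and the timestamp with a tab and to split on that tab only. The function name build_db and its two parameters are hypothetical, and like the script above it requires a find with -printf (GNU find).

```shell
#!/bin/sh
# Sketch: tab-separated find output so spaces in file names survive.
build_db() {
    # $1: directory to search, $2: output file (hypothetical parameters)
    tab=$(printf '\t')
    : > "$2"                                  # start with an empty output file
    find "$1" -name '*.nfo' -printf '%p\t%Tc\n' |
    while IFS=$tab read -r fname tstamp       # split on the tab only
    do
        awk -v tstamp="$tstamp" -F'[><]' -v OFS=":::" '
            # collect values per tag name, joining repeats with ", "
            { a[$2] = a[$2] sep[$2] $3; sep[$2] = ", " }
            END {
                print a["string1"], FILENAME, tstamp, a["string4"], a["string3"], a["hobby"], a["string2"]
            }
        ' "$fname" >> "$2"
    done
}
```

Usage would be along the lines of build_db /test /test/database.txt.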

An alternative if your find does not support -printf (suggestion: seriously consider getting a modern "find"!):

> outfile
find -name '*.nfo' -print |
while IFS= read -r fname
do
      tstamp=$(stat -c '%y' "$fname")    # %y = last modification time (%x would be last access)
      awk -v tstamp="$tstamp" -F'[><]' -v OFS=":::" '
          { a[$2] = a[$2] sep[$2] $3; sep[$2] = ", " }
          END {
              print a["string1"], FILENAME, tstamp, a["string4"], a["string3"], a["hobby"], a["string2"]
          }
      ' "$fname" >> outfile
done

If you don't have "stat", google for alternatives for getting a timestamp from a file, or consider parsing the output of ls -l. That is unreliable, but if it's all you've got...
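One alternative that might be available is date -r FILE, which GNU date and BusyBox date accept for printing a file's last modification time. Whether your system's date supports this is an assumption to check first, since some other date implementations treat -r differently. The helper name file_mtime is hypothetical.

```shell
#!/bin/sh
# Sketch: print a file's last modification time without stat.
# Assumes a date that accepts "-r FILE" (GNU coreutils, BusyBox).
file_mtime() {
    date -r "$1" '+%Y-%m-%d %H:%M:%S'
}
```

Inside the loop above, tstamp=$(file_mtime "$fname") could then replace the stat call.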
