ruby - Remove everything before the first slash in a URL?

Question

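How can I use a regular expression to remove everything before the first path slash (/) in a URL?

Example URL: https://www.example.com/some/page?user=1&email=joe@schmoe.org

From that, I just want /some/page?user=1&email=joe@schmoe.org

If it is just the root domain (i.e. https://www.example.com/), then I just want / to be returned.

The domain may or may not have a subdomain, and it may or may not use a secure protocol. Ultimately, I just want to strip out anything before the first path slash.

In case it matters, I am running Ruby 1.9.3.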

score 13 · Accepted Answer

Don't use a regular expression for this. Use the URI class. You can write:

require 'uri'

u = URI.parse('https://www.example.com/some/page?user=1&email=joe@schmoe.org')
u.path #=> "/some/page"
u.query #=> "user=1&email=joe@schmoe.org"

# All together - this will only return path if query is empty (no ?)
u.request_uri #=> "/some/page?user=1&email=joe@schmoe.org"
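To tie this back to the cases in the question, here is a minimal sketch that wraps `request_uri` in a helper. The name `path_and_query` is hypothetical (it is not part of the URI library), and the sketch assumes the input is always an absolute `http` or `https` URL, since `request_uri` is defined on `URI::HTTP` rather than on generic URIs.

```ruby
require 'uri'

# Hypothetical helper: returns the path plus query string of an http(s) URL.
# Assumes the URL includes its scheme (http:// or https://).
def path_and_query(url)
  URI.parse(url).request_uri
end

puts path_and_query('https://www.example.com/some/page?user=1&email=joe@schmoe.org')
# prints /some/page?user=1&email=joe@schmoe.org
puts path_and_query('https://www.example.com/')
# prints /
```

A URL without a scheme (for example `www.example.com/some/page`) parses as a generic URI that has no `request_uri` method, so this sketch would need an extra check before it could accept such input.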

score 5 · Accepted Answer

 require 'uri'

 uri = URI.parse("https://www.example.com/some/page?user=1&email=joe@schmoe.org")

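 # Note: uri.query is nil when the URL has no query string,
 # so this concatenation raises a TypeError in that case.
 > uri.path + '?' + uri.query
  => "/some/page?user=1&email=joe@schmoe.org"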

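As Gavin also said, tempting as it is, using a RegExp for this is not a good idea. You may run into URLs containing special characters (including Unicode characters) that you did not anticipate when you wrote the RegExp. This is especially likely in the query string. Using the URI library is the safer approach.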
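One caveat with joining `path` and `query` by hand is that `query` is `nil` when the URL has no query string. A minimal sketch of a nil-safe variant follows; the helper name `path_with_optional_query` is made up for illustration.

```ruby
require 'uri'

# Hypothetical helper: joins path and query with '?', skipping the query
# (and the '?') when the URL has no query string.
def path_with_optional_query(url)
  uri = URI.parse(url)
  # compact drops a nil query, so join only inserts '?' when a query exists
  [uri.path, uri.query].compact.join('?')
end

puts path_with_optional_query('https://www.example.com/some/page?user=1&email=joe@schmoe.org')
# prints /some/page?user=1&email=joe@schmoe.org
puts path_with_optional_query('https://www.example.com/')
# prints /
```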

score 0 · Accepted Answer

You can do the same thing using String#index

index(substring[, offset])

str = "https://www.example.com/some/page?user=1&email=joe@schmoe.org"
offset = str.index("//") # => 6
str[str.index('/',offset + 2)..-1]
# => "/some/page?user=1&email=joe@schmoe.org"
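The snippet above assumes the URL always contains a path slash; for a URL such as `http://test.com`, the inner `index` call returns `nil`. A minimal sketch that guards against this is shown below. The method name `strip_origin` is hypothetical, and returning `"/"` for a missing path is an assumption based on what the question asks for with a bare root domain.

```ruby
# Hypothetical helper built on String#index.
# Assumes that if "//" appears in the string, its first occurrence is the
# scheme separator (as in "https://").
def strip_origin(url)
  scheme_sep = url.index('//')
  # Start searching just past "//" when present, otherwise from the beginning
  search_from = scheme_sep ? scheme_sep + 2 : 0
  first_slash = url.index('/', search_from)
  # Fall back to "/" when the URL has no path slash at all
  first_slash ? url[first_slash..-1] : '/'
end

puts strip_origin('https://www.example.com/some/page?user=1&email=joe@schmoe.org')
# prints /some/page?user=1&email=joe@schmoe.org
puts strip_origin('http://test.com')
# prints /
```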

score 0 · Accepted Answer

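I strongly agree with the advice to use the URI module in this case, and I don't consider myself good at regular expressions. Still, it seems worthwhile to show one possible way to do what you are asking.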

test_url1 = 'https://www.example.com/some/page?user=1&email=joe@schmoe.org'
test_url2 = 'http://test.com/'
test_url3 = 'http://test.com'

regex = /^https?:\/\/[^\/]+(.*)/

regex.match(test_url1)[1]
# => "/some/page?user=1&email=joe@schmoe.org"

regex.match(test_url2)[1]
# => "/"

regex.match(test_url3)[1]
# => ""

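Note that in the last case the URL has no trailing '/', so the result is an empty string.

The regular expression (/^https?:\/\/[^\/]+(.*)/) says that the string starts (^) with http (http), optionally followed by s (s?), followed by :// (:\/\/), followed by at least one non-slash character ([^\/]+), followed by zero or more characters, which we want to capture ((.*)).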

I hope this example and explanation are helpful. Again, I recommend not actually using a regular expression in this case. The URI module is easier to use and far more robust.
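For readers who still want to experiment with the regular expression, here is a minimal sketch of a variant that returns "/" instead of an empty string when the URL has no path. The constant `ORIGIN_RE` and the method `path_after_origin` are names invented for this illustration. It uses a `%r{}` literal so the slashes need no escaping, and `\A` so the match is anchored at the start of the string rather than at the start of any line.

```ruby
# Matches the scheme and host portion at the very start of the string.
ORIGIN_RE = %r{\Ahttps?://[^/]+}

# Hypothetical helper: removes the scheme and host, and falls back to "/"
# when nothing is left. Input that does not match is returned unchanged.
def path_after_origin(url)
  rest = url.sub(ORIGIN_RE, '')
  rest.empty? ? '/' : rest
end

puts path_after_origin('https://www.example.com/some/page?user=1&email=joe@schmoe.org')
# prints /some/page?user=1&email=joe@schmoe.org
puts path_after_origin('http://test.com/')
# prints /
puts path_after_origin('http://test.com')
# prints /
```

This sketch shares the limits of the original pattern: a URL such as `http://test.com?x=1`, which has a query but no path slash, is treated as all host, which is one more reason to prefer the URI module.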
