I have a very large set of string URL patterns like {http://www.imdb.com, http://www.amazon.com,...} in a list.
I receive input URLs like this:
http://www.imdb.com/title/tt1409024/
For the purpose of my application, this URL is actually formed from http://www.imdb.com, so the two should be treated as equal.
To implement this, I can extract the base URL from the input URL:
http://www.imdb.com/title/tt1409024/ => http://www.imdb.com
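For illustration, the extraction step could look roughly like the sketch below. It assumes the input is a well-formed absolute URL; the class name `BaseUrlExtractor` and method name `baseUrl` are made up for this example and are not part of my actual code.

```java
import java.net.URI;

class BaseUrlExtractor {
    // Hypothetical helper: reduces a full URL to scheme + host
    // (plus the port, if one is given explicitly).
    static String baseUrl(String inputUrl) {
        // URI.create throws IllegalArgumentException on malformed input.
        URI uri = URI.create(inputUrl);
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            throw new IllegalArgumentException("No scheme or host in: " + inputUrl);
        }
        int port = uri.getPort(); // -1 when the URL has no explicit port
        return scheme + "://" + host + (port == -1 ? "" : ":" + port);
    }

    public static void main(String[] args) {
        // prints "http://www.imdb.com"
        System.out.println(baseUrl("http://www.imdb.com/title/tt1409024/"));
    }
}
```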
Now I need to compare this extracted URL against the master list of URLs and, if a match is found, store the base URL in a database. In essence, for each of my input URLs, I look for the extracted base URL in the master list, and if it is there, I store that base URL in the database.
To implement the equality/matching logic, I have two possible solutions. Please weigh in as to which is better:
- Put the master list of URLs in an ArrayList, and use the ArrayList contains method
- Put the master list in a database, and use a query to check the input URL against it
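To make the first option concrete, here is a minimal sketch of what I have in mind. The class name `MasterListMatcher` and the sample entries are placeholders for this example; the real master list is much larger.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MasterListMatcher {
    private final List<String> masterList;

    MasterListMatcher(List<String> urls) {
        // Copy into an ArrayList, as described in the first option.
        this.masterList = new ArrayList<>(urls);
    }

    // ArrayList.contains checks elements one by one using equals(),
    // so each lookup scans the list until it finds a match.
    boolean matches(String baseUrl) {
        return masterList.contains(baseUrl);
    }

    public static void main(String[] args) {
        MasterListMatcher matcher = new MasterListMatcher(
                Arrays.asList("http://www.imdb.com", "http://www.amazon.com"));
        System.out.println(matcher.matches("http://www.imdb.com"));    // prints "true"
        System.out.println(matcher.matches("http://www.example.com")); // prints "false"
    }
}
```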
Can anyone tell me which one will be better in terms of performance?