I don't think the Mann-Whitney U test is a good way to do feature selection. Mann-Whitney tests whether the two variables have the same distribution; it tells you nothing about how correlated the variables are. For example:
>>> import numpy as np
>>> from scipy.stats import mannwhitneyu
>>> a = np.arange(100)
>>> b = np.arange(100)
>>> np.random.shuffle(b)
>>> np.corrcoef(a,b)
array([[ 1.        , -0.07155116],
       [-0.07155116,  1.        ]])
>>> mannwhitneyu(a, b)
(5000.0, 0.49951259627554112) # result for nearly uncorrelated a and b
>>> mannwhitneyu(a, a)
(5000.0, 0.49951259627554112) # result for perfectly correlated
Because a and b have the same distribution, we fail to reject the null hypothesis that the distributions are identical, no matter how a and b are paired.
And since in feature selection you are trying to find the features that best explain Y, the Mann-Whitney U test does not help you with that.
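To make the point reproducible, here is a minimal sketch of the same experiment as a script. It assumes a recent SciPy where `mannwhitneyu` accepts `alternative="two-sided"`; the seed and the `results` dictionary are my own additions for illustration, not part of the session above. The two cases contain exactly the same values and differ only in how they are paired with `a`.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

a = np.arange(100)
candidates = {
    "shuffled": rng.permutation(a),  # same values as a, random pairing
    "sorted": a.copy(),              # same values as a, identical pairing
}

results = {}
for name, b in candidates.items():
    r = np.corrcoef(a, b)[0, 1]                          # depends on the pairing
    u, p = mannwhitneyu(a, b, alternative="two-sided")   # ignores the pairing
    results[name] = (r, u, p)
    print(f"{name}: corr={r:.3f}, U={u}, p={p:.3f}")
```

The correlation changes from roughly zero to exactly 1 between the two cases, while the U statistic is 5000 both times and the p-value is identical, because the test only looks at the pooled ranks of the two samples and not at which element of a goes with which element of b.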