
Hi everyone.

I'm trying to understand "object detection with discriminatively trained part based models".

Since objects such as hands exhibit significant viewpoint variation, the authors develop mixture models to handle this problem. Please see Figure 1 for an illustration.

Figure 1. Detections obtained with a 2 component bicycle model. These examples illustrate the importance of deformations mixture models. In this model the first component captures sideways views of bicycles while the second component captures frontal and near frontal views. The sideways component can deform to match a "wheelie".

I have some questions about how to use this mixture model to detect objects. The relevant description is in Section 3.3, where the authors say: "To detect objects using a mixture model we use the matching algorithm described above to find root locations that yield high scoring hypotheses independently for each component." Does this mean that every component is used to find objects independently? Taking the 2 component bicycle model as an example, would we run each component over the image separately to detect bicycles?
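To make my reading of that sentence concrete, here is a minimal sketch of what I think "independently for each component" means. Everything in it is my own assumption and not taken from the authors' code: the function name `detect_with_mixture`, the toy score maps, and the threshold are all hypothetical. I assume each component has already produced a map of scores over root locations, and that detections are simply the root locations whose score exceeds a threshold, pooled across components.

```python
# Hypothetical sketch of my understanding; not the authors' implementation.

def detect_with_mixture(component_scores, threshold):
    """Collect high-scoring root locations from each component independently.

    component_scores: list of 2D lists, one score map per mixture component,
                      where score_map[y][x] is the score of a root at (y, x).
    Returns a list of (score, component_index, y, x), best score first.
    """
    detections = []
    for c, score_map in enumerate(component_scores):
        # Each component is thresholded on its own score map.
        for y, row in enumerate(score_map):
            for x, score in enumerate(row):
                if score > threshold:
                    detections.append((score, c, y, x))
    # Pool the hypotheses from all components and rank them by score.
    detections.sort(reverse=True)
    return detections


# Toy score maps (made-up numbers) for a 2 component model.
side_view = [[-1.2, 0.4], [0.9, -0.3]]
frontal = [[-0.5, -0.8], [0.1, 1.5]]

detections = detect_with_mixture([side_view, frontal], threshold=0.0)
for score, component, y, x in detections:
    print(f"component {component} at ({y}, {x}) with score {score}")
```

If this reading is right, a single image location could be reported by both components, so I would expect some later step (for example non-maximum suppression) to resolve overlapping hypotheses, but that is also my assumption.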

For the training part of the paper, as I understand it, the training data consists of images with labeled bounding boxes. Please see Figure 2.

Figure 2. Left: examples of training images. Right: the result of training, including the model structure, filters, and deformation costs.

My question is whether the number of deformable parts in the model is learned during training or set in advance. For example, the person model in Figure 2 has 5 deformable parts. Is the number 5 obtained by training or not?

I also tried to find the answer by reading the authors' source code. I've checked all the trained models, such as car, person, bird, and bottle. In these files there is a field named "filters", which is a 1-by-54 structure array. As I understand it, these 54 filters are the root filters and part filters. Am I right? I'm confused by the number 54. How is it obtained? And which of these 54 filters are root filters and which are part filters?
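One decomposition that would give 54, which I have not been able to confirm from the code, is 6 mixture components with 1 root filter and 8 part filters each. Both numbers are guesses on my part, and the helper `count_filters` below is purely hypothetical; it only checks the arithmetic and says nothing about how the filters are actually ordered in the array.

```python
# Hypothetical arithmetic check; the component and part counts are my guesses.

def count_filters(num_components, parts_per_component):
    """Total filters if every component has one root filter plus its parts."""
    return num_components * (1 + parts_per_component)


total = count_filters(num_components=6, parts_per_component=8)
print(total)
```

If this guess is correct, it would also suggest that the number of parts is fixed in advance and not learned, but I would like confirmation.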

I'll be giving a presentation about your paper this Friday. If possible, could you do me the favor of answering this email before then? I know this request is kind of rude. :-) Anyway, thanks so much.
