I am developing a web app in Django and want to use Lucene as the search engine. However, I need to customize the analyzer for my purposes. For example, the word "\(H_2\)" should become "H2" before indexing. I am not even sure whether this is the analyzer's job.
I searched Google and found these pages useful:
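To make the goal concrete, here is a minimal sketch in plain Python of the normalization I have in mind. It does not use Lucene or PyLucene at all; the function name `normalize_formula` and the regular expression are only illustrative assumptions about the input format (LaTeX inline math wrapped in `\(` and `\)`).

```python
import re

# Matches LaTeX inline math such as \(H_2\) and captures the inner part.
# Assumption: formulas are always wrapped in \( ... \) delimiters.
_INLINE_MATH = re.compile(r"\\\((.*?)\\\)")


def normalize_formula(text):
    """Hypothetical helper: turn '\\(H_2\\)' into 'H2' before indexing."""
    def _clean(match):
        inner = match.group(1)
        # Drop subscript markers and grouping braces, keep the symbols.
        return re.sub(r"[_{}]", "", inner)

    return _INLINE_MATH.sub(_clean, text)


print(normalize_formula(r"\(H_2\)"))  # prints: H2
```

Something like this could probably be run on the text before it is handed to the indexer, but I would prefer to do it inside the analysis chain if that is the right place for it.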
http://packages.python.org/pyes/guide/reference/index-modules/analysis/index.html
http://hi.baidu.com/aruizen/blog/item/7b5fcb2a05ff122cd52af12a.html ('Extending' Java classes from Python)
But I still cannot work out how to customize the StandardAnalyzer from Python. As you can see, the two links above show very different code.
Thank you!