I've recently added the Sphider crawler to my site in order to add search functionality. But the default search.php that comes with the distribution of Sphider that I downloaded is too plain and doesn't integrate well with the rest of my site. I have a little navigation bar at the top of the site which has a search box in it, and I'd like to be able to access Sphider's search results through that search field using Ajax. To do this, I figure I need to get Sphider to return its results in JSON format.
The way I did that is I used a "theme" that outputs JSON (Sphider supposts "theming" its output). I found that theme on this thread on Sphider's site. It seems to work, but more strict JSON parsers will not parse it. Here's some example JSON output:
{"result_report":"Displaying results 1 - 1 of 1 match (0 seconds) ", "results":[ { "idented":"false", "num":"1", "weight":"[100.00%]", "link":"http://www.avtainsys.com/articles/Triple_Contraints", "title":"Triple Contraints", "description":" on 01/06/12 Project triple constraints are time, cost, and quality. These are the three constraints that control the performance of the project. Think about this triple-constraint as a three-leg tripod. If one of the legs is elongated or", "link2":"http://www.avtainsys.com/articles/Triple_Contraints", "size":"3.3kb" }, { "num":"-1" } ], "other_pages":[ { "title":"1", "link":"search.php?query=constraints&start=1&search=1&results=10&type=and&domain=", "active":"true" }, ] }
The issue is that there is a trailing comma near the end. According to this, "trailing commas are not allowed" when using PHP's json_decode()
function. This JSON also failed to parse using this online formatter. But when I took the comma out, it worked and I got this better-formatted JSON:
{
"result_report":"Displaying results 1 - 1 of 1 match (0 seconds) ",
"results":[
{
"idented":"false",
"num":"1",
"weight":"[100.00%]",
"link":"http://www.avtainsys.com/articles/Triple_Contraints",
"title":"Triple Contraints",
"description":" on 01/06/12 Project triple constraints are time, cost, and quality. These are the three constraints that control the performance of the project. Think about this triple-constraint as a three-leg tripod. If one of the legs is elongated or",
"link2":"http://www.avtainsys.com/articles/Triple_Contraints",
"size":"3.3kb"
},
{
"num":"-1"
}
],
"other_pages":[
{
"title":"1",
"link":"search.php?query=constraints&start=1&search=1&results=10&type=and&domain=",
"active":"true"
}
]
}
Now, how would I do this programmatically? And (perhaps more importantly), is there a more elegant way of accomplishing this? And you should know that PHP is the only language I can run on my shared hosting account, so a Java solution for example would not work for me.