I tried to use the IBM Watson Document Conversion service with the demo PDF, but it does not split the document into pieces. All it does is create a single answer unit, which is very long:

"text": "Watson is an artificially intelligent computer system capable of answering questions posed in natural language,[2] developed in IBM's DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM's first CEO and industrialist Thomas J. Watson.[3][4] The computer system was specifically developed to answer questions on the quiz show Jeopardy![5] In 2011, Watson competed on Jeopardy! against former winners Brad Rutter and Ken Jennings.[3][6] Watson received the first place prize of $1 million.[7] Watson had access to 200 million pages of structured and unstructured content consuming four terabytes of disk storage[8] including the full text of Wikipedia,[9] but was not connected to the Internet during the game.[10][11] For each clue, Watson's three most probable responses were displayed on the television screen. Watson consistently outperformed its human opponents on the game's signaling device, but had trouble responding to a few categories, notably those having short clues containing only a few words. In February 2013, IBM announced that Watson software system's first commercial application would be for utilization management decisions in lung cancer treatment at Memorial Sloan- Kettering Cancer Center in conjunction with health insurance company WellPoint.[12] IBM Watson's former business chief Manoj Saxena says that 90% of nurses in the field who use Watson now follow its guidance.[13]"

Thanks in advance!

1 Answer

Unfortunately, that demo PDF is not the best document to use. Currently, answer units are split based on heading tags (h1 to h6), and that PDF does not contain any headings. =(
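
To illustrate why a document without headings ends up as a single answer unit, here is a minimal Python sketch of heading-based splitting. It is not the service's actual implementation: the names HeadingSplitter and split_into_units are hypothetical, and the sketch assumes the input is simple, well-formed HTML.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingSplitter(HTMLParser):
    """Hypothetical helper: starts a new unit at every h1-h6 tag."""

    def __init__(self):
        super().__init__()
        # Text that appears before any heading goes into an untitled unit.
        self.units = [{"title": None, "text": []}]
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.units.append({"title": "", "text": []})
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.units[-1]["title"] += data
        elif data.strip():
            self.units[-1]["text"].append(data.strip())


def split_into_units(html):
    """Return one unit per heading; heading-free input yields one unit."""
    parser = HeadingSplitter()
    parser.feed(html)
    parser.close()
    # Drop the leading untitled unit if nothing preceded the first heading.
    return [u for u in parser.units if u["title"] is not None or u["text"]]
```

With input that contains only paragraphs, split_into_units returns a single untitled unit holding all the text, which mirrors the behaviour described in the question. With input that contains headings, it returns one unit per heading.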

If you set the conversion_target to NORMALIZED_HTML, you can see the converted PDF before it is split into answer units. It contains paragraphs, but no headings.
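
As a rough sketch, the configuration for that request could be built as a small JSON document like the one below. The conversion_target key and the NORMALIZED_HTML value come from the answer above; the build_config helper is hypothetical, and how the configuration is sent to the service (endpoint, authentication, file upload) is deliberately left out.

```python
import json


def build_config(conversion_target="NORMALIZED_HTML"):
    """Hypothetical helper: serialize a conversion config as JSON."""
    # NORMALIZED_HTML asks for the intermediate HTML rather than
    # the final answer units, so the paragraph structure can be inspected.
    return json.dumps({"conversion_target": conversion_target})


config_json = build_config()
```

Inspecting the returned HTML for h1 to h6 tags shows whether the document gives the service anything to split on.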

In the future, we plan to allow answer units to be split by paragraph, but that has not been released yet.

Update: We have updated the PDF on the demo site to provide a better example.

answered 2015-12-03T20:38:43.707