I am trying to query only the first sentence of a paragraph stored in an HTML field in SQL Server, but I can't find a way to do it. I have found solutions for MySQL that use SUBSTRING_INDEX, and solutions that return a fixed number of words or characters, but none that split on a specific delimiter.

My field is stored as HTML, an example is as follows:

<html><body>Enter the following page information.<br><br>
<b>Display #:</b> 1 [Automatically Populated]<br>
<b>Start Page: </b> 1 [Automatically Populated]<br>
<b>DCI Name:</b>  DEMOG<br>
<b>Clinical Planned Event:</b>  BASELINE1<br>
<font color="#0070C0">TAKE A SCREENSHOT</font>.<br>
</body></html>

In this example, I want to return only "Enter the following page information" and not the rest of the paragraph. I assume the HTML line break (<br>) is the best delimiter, since some sentences may end in a colon.

Thank you in advance! I hope I explained the scenario well enough.

1 Answer

I realize this is ugly as sin, but assuming that the first <br> is the end of the line, this should work in the SQL Server back-end:

DECLARE @x nvarchar(200)
SET @x = N'<html><body>Enter the following page information.<br><br><b>Display #:</b>'

SELECT substring(@x,

    -- start: one character past the last '>' that precedes the first <br>
    (charindex('<br>', lower(@x)) - 1) - 
        (charindex('>', REVERSE(LEFT(@x, charindex('<br>', lower(@x)) - 1))))+2,

    -- length: number of characters between that '>' and the first <br>
    charindex('>', REVERSE(LEFT(@x, charindex('<br>', lower(@x)) - 1))) - 1

)

Basically, we find the first <br>, then the last > that precedes it (by reversing the text to the left of the <br> and searching from there), and return the text between the two. The distance between them gives the length.
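To make the arithmetic easier to follow, the same expression can be unpacked into named steps. This is only a sketch of the logic above using the same sample string; the variable names @br, @head and @back are mine, and the positions in the comments apply to this sample only.

```sql
DECLARE @x nvarchar(200), @br int, @head nvarchar(200), @back int

SET @x = N'<html><body>Enter the following page information.<br><br><b>Display #:</b>'

-- Position of the first <br> (50 for this sample).
SET @br = CHARINDEX('<br>', LOWER(@x))

-- Everything to the left of that <br>.
SET @head = LEFT(@x, @br - 1)

-- Distance from the end of @head back to the nearest '>' (38 for this sample).
SET @back = CHARINDEX('>', REVERSE(@head))

-- Start just after that '>' (position 13) and take @back - 1 = 37 characters.
SELECT SUBSTRING(@x, @br - @back + 1, @back - 1) AS FirstSentence
-- returns: Enter the following page information.
```

Note that this, like the one-line version, assumes the string contains a <br> and that a > appears somewhere before it; otherwise LEFT or SUBSTRING receives a negative length and raises an error.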

This could absolutely be written more cleanly as a user-defined function, but I opted for a single inline T-SQL expression to avoid creating one.
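If the HTML lives in a table column rather than a variable, one way to keep things readable without a user-defined function is to compute the two positions once per row with CROSS APPLY. The following is a rough sketch under assumptions: the table variable @Instructions and the columns Id and BodyHtml are hypothetical stand-ins for your own table, and the column is assumed to be nvarchar rather than text/ntext. NULLIF turns a "not found" result of 0 into NULL, so rows without a <br> yield NULL instead of raising an error.

```sql
-- Hypothetical sample data standing in for the real table.
DECLARE @Instructions TABLE (Id int, BodyHtml nvarchar(max))

INSERT INTO @Instructions (Id, BodyHtml)
VALUES (1, N'<html><body>Enter the following page information.<br><br><b>Display #:</b> 1<br></body></html>')

INSERT INTO @Instructions (Id, BodyHtml)
VALUES (2, N'<html><body>No line break here</body></html>')

SELECT t.Id,
       -- Same start/length arithmetic as the single-expression version.
       SUBSTRING(t.BodyHtml, p.br - b.back + 1, b.back - 1) AS FirstSentence
FROM @Instructions AS t
-- br: position of the first <br>, or NULL when there is none.
CROSS APPLY (SELECT NULLIF(CHARINDEX('<br>', LOWER(t.BodyHtml)), 0) AS br) AS p
-- back: distance from the <br> back to the nearest '>', or NULL when not found.
CROSS APPLY (SELECT NULLIF(CHARINDEX('>', REVERSE(LEFT(t.BodyHtml, p.br - 1))), 0) AS back) AS b
```

With this sample data, row 1 returns the first sentence and row 2 returns NULL.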

A final note: you may not need the lower() calls. My test database is case-sensitive, hence the need to make the casing consistent before searching for <br>.

Answered 2012-10-01T19:08:59.783