I have some reports in HTML files. I need to bring them into Excel and make some changes, so I thought I could make those changes beforehand with PowerShell. Some of the lines I need to delete are at fixed positions; others are not, so the script has to recognize them by a pattern.
Fixed lines, counting from the top: 12-14, 17, 19, 25-27, 30-32, 40-42.
Fixed lines, counting from the bottom: 3-13, 48-60.
The pattern I need to find and delete is this:
<td align="center">random string</td>
<td align="left">random string</td>
<td align="left">random string</td>
<td align="left">random string</td>
<td align="right">random string</td>
For the fixed lines, I found that I can do this:
(Get-Content $maindir\Report23.HTML) | Where-Object { (12..14) -notcontains $_.ReadCount } | Out-File $maindir\Report23b.HTML
It works, in that it deletes lines 12-14, but I need to put the rest of the fixed line numbers into the same command and I can't figure out how. Also, the output file is twice the size of the original, which I find odd. I tried using Set-Content instead, which produces a file close to the original size but breaks the text encoding in certain parts.
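For reference, here is a minimal sketch of how the fixed line numbers might be combined, not a definitive solution. It relies on the fact that `+` concatenates arrays and ranges in PowerShell, and it converts the bottom-based numbers to top-based ones once the total line count is known. The function name `Remove-LineByNumber` and its parameters are hypothetical, and "counting from the bottom" is assumed to mean that 1 is the last line. On the file size: in Windows PowerShell 5.1, Out-File writes UTF-16 by default, which would roughly double the size of a mostly-ASCII file, and Set-Content writes the system ANSI code page by default, which would probably explain the broken characters. The commented usage lines assume the report is UTF-8; that is a guess and should be adjusted to the real encoding.

```powershell
# Hypothetical helper: returns the lines that are NOT listed for removal.
function Remove-LineByNumber {
    param(
        [string[]]$Lines,          # all lines of the file
        [int[]]$FromTop = @(),     # 1-based line numbers counted from the top
        [int[]]$FromBottom = @()   # 1-based, counted from the bottom (assumption: 1 = last line)
    )

    $total = $Lines.Count

    # Convert bottom-based numbers to top-based ones, then merge both lists.
    $remove = @($FromTop) + @($FromBottom | ForEach-Object { $total - $_ + 1 })

    # Emit every line whose 1-based number is not in the removal list.
    for ($i = 0; $i -lt $total; $i++) {
        if ($remove -notcontains ($i + 1)) { $Lines[$i] }
    }
}

# Usage with the numbers from the question (paths and UTF-8 are assumptions):
# $lines = @(Get-Content "$maindir\Report23.HTML" -Encoding UTF8)
# $top    = (12..14) + 17 + 19 + (25..27) + (30..32) + (40..42)
# $bottom = (3..13) + (48..60)
# Remove-LineByNumber -Lines $lines -FromTop $top -FromBottom $bottom |
#     Set-Content "$maindir\Report23b.HTML" -Encoding UTF8
```

Reading the whole file into an array first is what makes the bottom-based numbers possible, since the total line count has to be known before they can be translated.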
I have no idea how to go about recognizing the pattern, though.
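One possible approach to the pattern, sketched under some assumptions: each `<td>` sits on its own line, the five lines are consecutive, the `align` attributes are written exactly as shown above, and every occurrence of the block should be removed. The sketch slides over the lines, tests the next five against one regular expression each, and skips the block when all five match. The function name `Remove-CellBlock` is hypothetical.

```powershell
# Hypothetical helper: drops every run of five consecutive <td> lines
# whose align values are center, left, left, left, right.
function Remove-CellBlock {
    param([string[]]$Lines)

    # Expected align values of the five lines, in order.
    $aligns = 'center', 'left', 'left', 'left', 'right'

    # One regex per line; .*? stands for the "random string" cell content.
    $patterns = @($aligns | ForEach-Object { '^\s*<td align="' + $_ + '">.*?</td>\s*$' })

    $i = 0
    while ($i -lt $Lines.Count) {
        # A block can only start here if five lines are still available.
        $isBlock = ($i + $patterns.Count) -le $Lines.Count
        for ($j = 0; $isBlock -and $j -lt $patterns.Count; $j++) {
            if ($Lines[$i + $j] -notmatch $patterns[$j]) { $isBlock = $false }
        }

        if ($isBlock) {
            $i += $patterns.Count   # skip the whole five-line block
        } else {
            $Lines[$i]              # emit the line unchanged
            $i++
        }
    }
}

# Usage (paths and UTF-8 are assumptions):
# $lines = @(Get-Content "$maindir\Report23.HTML" -Encoding UTF8)
# Remove-CellBlock -Lines $lines | Set-Content "$maindir\Report23b.HTML" -Encoding UTF8
```

If both kinds of deletion are needed, removing the fixed-position lines first is probably safer, because deleting the pattern blocks first would shift the line numbers.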