PhpSpreadsheet/tests/PhpSpreadsheetTests/Reader/Html/HtmlLoadStringTest.php
Owen Leibman 6080c4561d Improve Coverage for HTML Reader
Reader/Html is now covered except for 1 statement.
There is some coverage of RichText when you know in advance that the
html will expand into a single cell.
It is a tougher nut, one that I have not yet cracked,
to try to handle rich text while converting unkown html to multiple cells.
The original author left this as a TODO, and so for now must I.

It made sense to restructure some of the code. There are some changes.
- Issue #1532 is fixed (links are now saved when using rowspan).
- Colors can now be specified as html color name. To accomplish this,
  Helper/Html function colourNameLookup was changed from protected
  to public, and changed to static.
- Superfluous empty lines were eliminated in a number of places, e.g.
  <ul><li>A</li><li>B</li><li>C</li></ul>
  had formerly caused a wrapped cell to be created with 2 empty lines
  followed by A, B, and C on separate lines; it will now just have the
  3 A/B/C lines, which seems like a more sensible interpretation.
- Img alt tag, which had been cast to float, is now used as a string.

Private member "encoding" is not used. Functions getEncoding and setEncoding
have therefore been marked deprecated. In fact, I was unable to get
SecurityScanner to pass *any* html which is not UTF-8. There are
possibly ways of getting around this (in Reader/Html - I have no
intention of messing with Security Scanner), as can be seen in my
companion pull request for Excel2003 Xml Reader. Doing this would be
easier for ASCII-compatible character sets (like ISO-8859-1),
than for non-compatible charsets (like UTF-16). I am not
convinced that the effort is worth it, but am willing to investigate
further.

I added a number of tests, creating an Html directory, and moving
HtmlTest to that directory.
2020-06-25 22:42:38 -07:00

93 lines
3.2 KiB
PHP

<?php
namespace PhpOffice\PhpSpreadsheetTests\Reader\Html;
use PhpOffice\PhpSpreadsheet\Reader\Exception as ReaderException;
use PhpOffice\PhpSpreadsheet\Reader\Html;
use PHPUnit\Framework\TestCase;
class HtmlLoadStringTest extends TestCase
{
public function testCanLoadFromString(): void
{
$html = '<table>
<tr>
<td>Hello World</td>
</tr>
<tr>
<td>Hello<br />World</td>
</tr>
<tr>
<td>Hello<br>World</td>
</tr>
</table>';
$spreadsheet = (new Html())->loadFromString($html);
$firstSheet = $spreadsheet->getSheet(0);
$cellStyle = $firstSheet->getStyle('A1');
self::assertFalse($cellStyle->getAlignment()->getWrapText());
$cellStyle = $firstSheet->getStyle('A2');
self::assertTrue($cellStyle->getAlignment()->getWrapText());
$cellValue = $firstSheet->getCell('A2')->getValue();
self::assertStringContainsString("\n", $cellValue);
$cellStyle = $firstSheet->getStyle('A3');
self::assertTrue($cellStyle->getAlignment()->getWrapText());
$cellValue = $firstSheet->getCell('A3')->getValue();
self::assertStringContainsString("\n", $cellValue);
}
public function testLoadInvalidString(): void
{
$this->expectException(ReaderException::class);
$html = '<table<>';
$spreadsheet = (new Html())->loadFromString($html);
$firstSheet = $spreadsheet->getSheet(0);
$cellStyle = $firstSheet->getStyle('A1');
self::assertFalse($cellStyle->getAlignment()->getWrapText());
}
public function testCanLoadFromStringIntoExistingSpreadsheet(): void
{
$html = '<table>
<tr>
<td>Hello World</td>
</tr>
<tr>
<td>Hello<br />World</td>
</tr>
<tr>
<td>Hello<br>World</td>
</tr>
</table>';
$reader = new Html();
$spreadsheet = $reader->loadFromString($html);
$firstSheet = $spreadsheet->getSheet(0);
$cellStyle = $firstSheet->getStyle('A1');
self::assertFalse($cellStyle->getAlignment()->getWrapText());
$cellStyle = $firstSheet->getStyle('A2');
self::assertTrue($cellStyle->getAlignment()->getWrapText());
$cellValue = $firstSheet->getCell('A2')->getValue();
self::assertStringContainsString("\n", $cellValue);
$cellStyle = $firstSheet->getStyle('A3');
self::assertTrue($cellStyle->getAlignment()->getWrapText());
$cellValue = $firstSheet->getCell('A3')->getValue();
self::assertStringContainsString("\n", $cellValue);
$reader->setSheetIndex(1);
$html = '<table>
<tr>
<td>Goodbye World</td>
</tr>
</table>';
self::assertEquals(1, $spreadsheet->getSheetCount());
$spreadsheet = $reader->loadFromString($html, $spreadsheet);
self::assertEquals(2, $spreadsheet->getSheetCount());
}
}