* Fix for 3 Issues Involving ReadXlsx and NamedRange
Issues #1686 and #1723, which provide sample spreadsheets, are probably
solved by this ticket. Issue #1730 is also probably solved, but I have
no way to verify.
There are two problems with how PhpSpreadsheet is handling things now.
Although the first problem is much less severe, and isn't really a factor
in the issues named above, it is helpful to get it out of the way first.
If you define a named range in Excel, and then delete the sheet where
the range exists, Excel saves the range as #REF!. If there is a cell which
references the range, it will similarly have the value #REF! when you open
the Excel file.
Currently, PhpSpreadsheet discards the #REF! definition, so a cell which
references the range will appear as #NAME? rather than #REF!.
This PR changes the behavior so that PhpSpreadsheet retains the #REF!
definition, and cells which reference it will appear as #REF!.
The second problem is the more severe, and is, I believe, responsible
for the 3 issues identified above.
If you define a named range and the sheet on which the range is defined
does not exist at the time, Excel will save the range as something like:
'[1]Unknown Sheet'!$A$1
If a cell references such a range, Excel will again display #REF!.
PhpSpreadsheet currently throws an Exception when it encounters
such a definition while reading the file. This PR changes
the behavior so that PhpSpreadsheet saves the definition as #REF!,
and cells which reference it will behave similarly.
For the record, I will note that Excel does not magically recalculate when a
missing sheet is subsequently added, despite the fact that the reference
might now become resolvable. PhpSpreadsheet behaves likewise.
* Remove Dead Code in Test
Identified it after push but before merge.
* Improving Coverage for Excel2003 XML Reader
Reader/Xml is now 100% covered.
File templates/Excel2003XMLTest.xml, used in some tests, is *not*
readable by a current version of Excel. I have substituted a new file
excel2003.xml to be used in its place. I have not deleted the original
in case someone in future (possibly me) wants to see what it needs to
make it usable.
There are minimal code changes.
- Unused protected functions pixel2WidthUnits and widthUnits2Pixel
are deleted.
- One regex looking to convert hex characters is changed from a-z to a-f,
and made case insensitive.
- No calculation performed for "error" cell (previously calculation
was attempted and threw exception).
- Empty relative row/cell is now handled correctly.
- Style applied to empty cell when appropriate.
- Support added for textRotation.
- Support added for border styles.
- Support added for diagonal borders.
- Support added for superscript and subscript.
- Support added for fill patterns.
In theory, encodings other than UTF-8 were supported.
In fact, I was unable to get SecurityScanner to pass *any* xml which is
not UTF-8. Eliminating the assumption that strings might not be UTF-8
allowed much of the code to be greatly simplified.
After that, I added some code that would permit the use of
some ASCII-compatible encodings (there is a test of ISO-8859-1).
It would be more difficult to handle other encodings (such as UTF-16).
I am not convinced that even the ISO-8859 effort is worth it,
but am willing to investigate either expanding or eliminating
non-UTF8 support.
I added a number of tests, creating an Xml directory, and moving
XmlTest to that directory.
Pull Request had problems reading old invalid sample in the code
coverage phase, not in any of the other test phases, and not in
the code coverage phase on my local machine.
As it turns out, aside from being invalid, the sample
is much larger than any of the other samples. Tests have been
adjusted accordingly.
* Smaller Test File
Should eliminate need to avoid test during xml coverage.
* Break Up Style Test into Multiple Tests
Per suggestion from Mark Baker.
* Integrate AddressHelper Change
The introduction of AddressHelper introduced a conflict which needed to
be resolved. I wanted to test it locally before resolving. This required
me to add (unchanged) AddressHelper to my local copy. I hope this is
an okay manner of resolving the conflict.
* Weird Travis Error
XmlOddTest works just fine on my local machine, but Travis failed it.
Even worse, the lines which Travis flags don't even make any sense
(one was the empty line between two methods!).
This test is not essential to the rest of the change. I am removing
it from the package, and will attempt to re-add it when I have a chance
to sync up my fork with the main project.
* Improve Coverage for ODS Reader
Reader/ODS/Properties is now 100% covered.
Reader/ODS is covered except for 1 statement. As the original author
put it, "table-header-rows TODO: figure this out ... I'm not sure that
PhpExcel has an API for this". I'm still thinking about it, but, so far,
I agree with the author.
There are minimal code changes.
- Several places test !zip->open() to see whether the test failed.
However, zip->open() returns true or a string, so the test never
detects failure. Change to zip->open() !== true. No previous tests.
- Suppress warning messages from simplexml_load_string (there had
been no tests for invalid xml).
- One document property was misnamed, and one non-existent property
was tested for.
I added a number of tests, creating an ODS directory, and moving
OdsTest to that directory.
* Scrutinizer Recommendation
Unused variable in one test.
* Update CHANGELOG
Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
Reader/Html is now covered except for 1 statement.
There is some coverage of RichText when you know in advance that the
html will expand into a single cell.
It is a tougher nut, one that I have not yet cracked,
to try to handle rich text while converting unkown html to multiple cells.
The original author left this as a TODO, and so for now must I.
It made sense to restructure some of the code. There are some changes.
- Issue #1532 is fixed (links are now saved when using rowspan).
- Colors can now be specified as html color name. To accomplish this,
Helper/Html function colourNameLookup was changed from protected
to public, and changed to static.
- Superfluous empty lines were eliminated in a number of places, e.g.
<ul><li>A</li><li>B</li><li>C</li></ul>
had formerly caused a wrapped cell to be created with 2 empty lines
followed by A, B, and C on separate lines; it will now just have the
3 A/B/C lines, which seems like a more sensible interpretation.
- Img alt tag, which had been cast to float, is now used as a string.
Private member "encoding" is not used. Functions getEncoding and setEncoding
have therefore been marked deprecated. In fact, I was unable to get
SecurityScanner to pass *any* html which is not UTF-8. There are
possibly ways of getting around this (in Reader/Html - I have no
intention of messing with Security Scanner), as can be seen in my
companion pull request for Excel2003 Xml Reader. Doing this would be
easier for ASCII-compatible character sets (like ISO-8859-1),
than for non-compatible charsets (like UTF-16). I am not
convinced that the effort is worth it, but am willing to investigate
further.
I added a number of tests, creating an Html directory, and moving
HtmlTest to that directory.
This problem is the same as #1238, which was resolved by #1239.
For that issue, the fix was to check in one place whether
$this->mapCellXfIndex[$xfIndex] was set before using it.
The sample spreadsheet supplied as a description for this
problem had exactly the same problem in 2 other places in the code.
In addition, there were 7 other places in the code where that
particular item was used unchecked. This fix corrects all 9 locations.
The spreadsheet supplied with the problem is used as the basis
for some new tests, which particularly test column dimensions
and styles, the problems involved in this case.
* Improve Coverage for Sylk
I believe that both BaseReader and Sylk Reader are now 100% covered.
Documentation available for this format is sparse.
It was always incomplete, and in some cases inaccurate.
My goal was to use PhpSpreadsheet to load the test file,
save it as Xlsx, and visually compare the two, then add a test
loaded with assertions. Cell values and calculated values,
and border styles were generally handled pretty well without changes.
Other types of styling were not handled so well. I added a few cells
to exercise some previously uncovered code.
Sylk files must be ASCII. I have deprecated the use of the
setEncoding and getEncoding functions, which had no test cases.
* Improve Coverage for Gnumeric
I believe that both BaseReader and Gnumeric Reader are now 100% covered.
My goal was to use PhpSpreadsheet to load the test file,
save it as Xlsx, and visually compare the two, then add a test
loaded with assertions. Results were generally pretty good,
but there were no tests with assertions. I added a few cells
to exercise some previously uncovered code. Code was extensively
refactored; logic changes are noted below.
Code allowed for specifying document properties in an old format.
I considered removing that, but I found the original spec at
http://www.jfree.org/jworkbook/download/gnumeric-xml.pdf
This allowed me to create an old file, which was not handled
correctly because of namespace differences. The code was corrected
to allow for this difference.
Added support for textRotation.
Mapping of fill types was not correct.
* PHP7.2 Error
One assertion failed under PHP7.2. Apparently there was some change in
the handling of SimpleXMLElement between 7.2 and 7.3. Casting to string
before use eliminates the problem.
* Scrutinizer Recommendations
All minor, solved (hopefully) mostly by casts.
* One Last Scrutinizer Fix
... I hope.
No code changes. The tests in all of these scripts write to at least
one temporary file, which is then read and not used again. The file
should be deleted to avoid filling up the disk system.
File author erroneously assumed that backslash was used to escape
quotes in CSV; in fact, doubling the quote is used for escape.
The test still worked, but mainly because the content of the cell
with the escape wasn't tested. The file is now fixed, and
a new test added.
I believe that both CSV Reader and Writer are 100% covered now.
There were some errors uncovered during development.
The reader specifically permits encodings other than UTF-8 to be used.
However, fgetcsv will not properly handle other encodings.
I tried replacing it with fgets/iconv/strgetcsv, but that could not
handle line breaks within a cell, even for UTF-8.
This is, I'm sure, a very rare use case.
I eventually handled it by using php://memory to hold the translated
file contents for non-UTF8. There were no tests for this situation,
and now there are (probably too many).
"Contiguous" read was not handle correctly. There is a file
in samples which uses it. It was designed to read a large sheet,
and split it into three. The first sheet was corrrect, but the
second and third were almost entirely empty. This has been corrected,
and the sample code was adapted into a formal test with assertions
to confirm that it works as designed.
I made a minor documentation change. Unlike HTML, where you never
need a BOM because you can declare the encoding in the file,
a CSV with non-ASCII characters must explicitly include a BOM
for Excel to handle it correctly. This was explained in the Reading CSV
section, but was glossed over in the Writing CSV section, which I
have updated.
Indentation in the xml leaves spaces in style string even after
replacing newlines. Replacing the spaces ensures no spaces in keys
of the resulting style-array
Fixes#1347
The `setRange` method of the `Xlsx/AutoFilter` class expects a filter
range format like "A1:E10". The returned value from
`$this->worksheetXml->autoFilter['ref']` could contain "$" and returning
a value like "$A$1:$E$10".
Fixes#687Fixes#1325Closes#1326
Support for the CONTAINSBLANKS conditional style was added a while ago.
However, that support was on write only; any cells which used
CONTAINSBLANKS on a file being read would drop that style.
I am also adding support for NOTCONTAINSBLANKS, on read and write.
* Handle ConditionalStyle NumberFormat When Reading Xlsx File
ReadStyle in Reader/Xlsx/Styles.php expects numberFormat to be a string.
However, when reading conditional style in Xlsx file, NumberFormat
is actually a SimpleXMLElement, so is not handled correctly.
While testing this change, it turned out that reader always expects
that there is a "SharedString" portion of the XML, which is not
true for spreadsheets with no string data, which causes a
run-time message.
Likewise, when conditional number format is not one of the built-in
formats, a run-time message is issued because 'isset' is used
to determine existence rather than 'array_key_exists'.
The new workbook added to the testing data demonstrates both those
problems (prior to the code changes).
* Move Comment to Resolve Conflict
Github reports conflict involving placement of one comment statement.
* Respond to Scrutinizer Style Suggestion
Change detection for empty SimpleXMLElement.
We often want to export a table as an excel sheet. The system renders the
html and it seems like a waste of time to write it to the file system to
use the reader. This allows us to render the html and then just pass it to
a reader
Closes#1136
* When <br> appears in a table cell, set the cell to wrap.
If the cell is not set to wrap, it appears as a single line when first
displayed in Excel, although editing the cell will cause Excel to wrap
it.
* fix whitespace
Upstream has a coding standard that includes whitespace
* Add Unit tests for cell wrapping
* Update changelog
XmlScanner was not restoring libxml_disable_entity_loader since
destruct was not being called until script shutdown. This is because
the shutdown handler required an XmlScanner instance.
Also fix an unrelated bug where the UTF-8 encoding test was
case sensitive.
* Fix failure when parsing xlsx with drawing having double (redefined) attributes
* Fix failure when parsing xlsx with drawing having double (redefined) attributes
* Fix#853 when loading and saving XLSX file with empty drawing cause corrupted output file. Store empty drawing as unparsed entity and save it as is when saving the file.
* Fix code style
* Extract character set, so we can convert to UTF-8 if required
* Set column width and row height when defined on tr/td
* Parse align and valign on td
* Specify number format of cell via html attribute
* Formatting of b, strong, i and em tags
* Inserting image in cell when using img tag in html
* Add applying inline styles: border, fonts, alignment, dimensions
* Add tests for applying inline styles