PhpSpreadsheet

Author	SHA1	Message	Date
oleibman	497a934374	Fix for 3 Issues Involving ReadXlsx and NamedRange (#1742) * Fix for 3 Issues Involving ReadXlsx and NamedRange Issues #1686 and #1723, which provide sample spreadsheets, are probably solved by this ticket. Issue #1730 is also probably solved, but I have no way to verify. There are two problems with how PhpSpreadsheet is handling things now. Although the first problem is much less severe, and isn't really a factor in the issues named above, it is helpful to get it out of the way first. If you define a named range in Excel, and then delete the sheet where the range exists, Excel saves the range as #REF!. If there is a cell which references the range, it will similarly have the value #REF! when you open the Excel file. Currently, PhpSpreadsheet discards the #REF! definition, so a cell which references the range will appear as #NAME? rather than #REF!. This PR changes the behavior so that PhpSpreadsheet retains the #REF! definition, and cells which reference it will appear as #REF!. The second problem is the more severe, and is, I believe, responsible for the 3 issues identified above. If you define a named range and the sheet on which the range is defined does not exist at the time, Excel will save the range as something like: '[1]Unknown Sheet'!$A$1 If a cell references such a range, Excel will again display #REF!. PhpSpreadsheet currently throws an Exception when it encounters such a definition while reading the file. This PR changes the behavior so that PhpSpreadsheet saves the definition as #REF!, and cells which reference it will behave similarly. For the record, I will note that Excel does not magically recalculate when a missing sheet is subsequently added, despite the fact that the reference might now become resolvable. PhpSpreadsheet behaves likewise. * Remove Dead Code in Test Identified it after push but before merge.	2020-12-10 18:08:10 +01:00
oleibman	1741766a9c	Improving Coverage for Excel2003 XML Reader (#1557) * Improving Coverage for Excel2003 XML Reader Reader/Xml is now 100% covered. File templates/Excel2003XMLTest.xml, used in some tests, is not readable by a current version of Excel. I have substituted a new file excel2003.xml to be used in its place. I have not deleted the original in case someone in future (possibly me) wants to see what it needs to make it usable. There are minimal code changes. - Unused protected functions pixel2WidthUnits and widthUnits2Pixel are deleted. - One regex looking to convert hex characters is changed from a-z to a-f, and made case insensitive. - No calculation performed for "error" cell (previously calculation was attempted and threw exception). - Empty relative row/cell is now handled correctly. - Style applied to empty cell when appropriate. - Support added for textRotation. - Support added for border styles. - Support added for diagonal borders. - Support added for superscript and subscript. - Support added for fill patterns. In theory, encodings other than UTF-8 were supported. In fact, I was unable to get SecurityScanner to pass any xml which is not UTF-8. Eliminating the assumption that strings might not be UTF-8 allowed much of the code to be greatly simplified. After that, I added some code that would permit the use of some ASCII-compatible encodings (there is a test of ISO-8859-1). It would be more difficult to handle other encodings (such as UTF-16). I am not convinced that even the ISO-8859 effort is worth it, but am willing to investigate either expanding or eliminating non-UTF8 support. I added a number of tests, creating an Xml directory, and moving XmlTest to that directory. Pull Request had problems reading old invalid sample in the code coverage phase, not in any of the other test phases, and not in the code coverage phase on my local machine. As it turns out, aside from being invalid, the sample is much larger than any of the other samples. Tests have been adjusted accordingly. * Smaller Test File Should eliminate need to avoid test during xml coverage. * Break Up Style Test into Multiple Tests Per suggestion from Mark Baker. * Integrate AddressHelper Change The introduction of AddressHelper introduced a conflict which needed to be resolved. I wanted to test it locally before resolving. This required me to add (unchanged) AddressHelper to my local copy. I hope this is an okay manner of resolving the conflict. * Weird Travis Error XmlOddTest works just fine on my local machine, but Travis failed it. Even worse, the lines which Travis flags don't even make any sense (one was the empty line between two methods!). This test is not essential to the rest of the change. I am removing it from the package, and will attempt to re-add it when I have a chance to sync up my fork with the main project.	2020-10-11 13:26:56 +02:00
Adrien Crivelli	4739f8b2e7	Merge branch 'readhtml'	2020-07-26 13:11:15 +09:00
oleibman	735103c120	Improve Coverage for ODS Reader (#1545) * Improve Coverage for ODS Reader Reader/ODS/Properties is now 100% covered. Reader/ODS is covered except for 1 statement. As the original author put it, "table-header-rows TODO: figure this out ... I'm not sure that PhpExcel has an API for this". I'm still thinking about it, but, so far, I agree with the author. There are minimal code changes. - Several places test !zip->open() to see whether the test failed. However, zip->open() returns true or an error code, so the test never detects failure. Change to zip->open() !== true. No previous tests. - Suppress warning messages from simplexml_load_string (there had been no tests for invalid xml). - One document property was misnamed, and one non-existent property was tested for. I added a number of tests, creating an ODS directory, and moving OdsTest to that directory. * Scrutinizer Recommendation Unused variable in one test. * Update CHANGELOG Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>	2020-07-26 12:40:49 +09:00
MarkBaker	d009347e25	Forgot to check in the test files for the unit tests	2020-07-05 16:28:46 +02:00
Owen Leibman	6080c4561d	Improve Coverage for HTML Reader Reader/Html is now covered except for 1 statement. There is some coverage of RichText when you know in advance that the html will expand into a single cell. It is a tougher nut, one that I have not yet cracked, to try to handle rich text while converting unknown html to multiple cells. The original author left this as a TODO, and so for now must I. It made sense to restructure some of the code. There are some changes. - Issue #1532 is fixed (links are now saved when using rowspan). - Colors can now be specified as html color name. To accomplish this, Helper/Html function colourNameLookup was changed from protected to public, and changed to static. - Superfluous empty lines were eliminated in a number of places, e.g. <ul><li>A</li><li>B</li><li>C</li></ul> had formerly caused a wrapped cell to be created with 2 empty lines followed by A, B, and C on separate lines; it will now just have the 3 A/B/C lines, which seems like a more sensible interpretation. - Img alt tag, which had been cast to float, is now used as a string. Private member "encoding" is not used. Functions getEncoding and setEncoding have therefore been marked deprecated. In fact, I was unable to get SecurityScanner to pass any html which is not UTF-8. There are possibly ways of getting around this (in Reader/Html - I have no intention of messing with Security Scanner), as can be seen in my companion pull request for Excel2003 Xml Reader. Doing this would be easier for ASCII-compatible character sets (like ISO-8859-1), than for non-compatible charsets (like UTF-16). I am not convinced that the effort is worth it, but am willing to investigate further. I added a number of tests, creating an Html directory, and moving HtmlTest to that directory.	2020-06-25 22:42:38 -07:00
oleibman	38fab4e632	Fix for #1505 (#1525) This problem is the same as #1238, which was resolved by #1239. For that issue, the fix was to check in one place whether $this->mapCellXfIndex[$xfIndex] was set before using it. The sample spreadsheet supplied as a description for this problem had exactly the same problem in 2 other places in the code. In addition, there were 7 other places in the code where that particular item was used unchecked. This fix corrects all 9 locations. The spreadsheet supplied with the problem is used as the basis for some new tests, which particularly test column dimensions and styles, the problems involved in this case.	2020-06-19 21:01:18 +02:00
oleibman	41b95c1542	CSV Sample File Was Miscoded (#1489) File author erroneously assumed that backslash was used to escape quotes in CSV; in fact, doubling the quote is used for escape. The test still worked, but mainly because the content of the cell with the escape wasn't tested. The file is now fixed, and a new test added.	2020-05-24 19:57:39 +09:00
oleibman	7517cdd008	Improve Coverage for CSV (#1475) I believe that both CSV Reader and Writer are 100% covered now. There were some errors uncovered during development. The reader specifically permits encodings other than UTF-8 to be used. However, fgetcsv will not properly handle other encodings. I tried replacing it with fgets/iconv/str_getcsv, but that could not handle line breaks within a cell, even for UTF-8. This is, I'm sure, a very rare use case. I eventually handled it by using php://memory to hold the translated file contents for non-UTF8. There were no tests for this situation, and now there are (probably too many). "Contiguous" read was not handled correctly. There is a file in samples which uses it. It was designed to read a large sheet, and split it into three. The first sheet was correct, but the second and third were almost entirely empty. This has been corrected, and the sample code was adapted into a formal test with assertions to confirm that it works as designed. I made a minor documentation change. Unlike HTML, where you never need a BOM because you can declare the encoding in the file, a CSV with non-ASCII characters must explicitly include a BOM for Excel to handle it correctly. This was explained in the Reading CSV section, but was glossed over in the Writing CSV section, which I have updated.	2020-05-17 18:15:18 +09:00
oleibman	082266aacd	Conditionals - Extend Support for (NOT)CONTAINSBLANKS (#1278) Support for the CONTAINSBLANKS conditional style was added a while ago. However, that support was on write only; any cells which used CONTAINSBLANKS on a file being read would drop that style. I am also adding support for NOTCONTAINSBLANKS, on read and write.	2020-01-04 18:50:04 +01:00
oleibman	afd070a756	Handle ConditionalStyle NumberFormat When Reading Xlsx File (#1296) * Handle ConditionalStyle NumberFormat When Reading Xlsx File ReadStyle in Reader/Xlsx/Styles.php expects numberFormat to be a string. However, when reading conditional style in Xlsx file, NumberFormat is actually a SimpleXMLElement, so is not handled correctly. While testing this change, it turned out that reader always expects that there is a "SharedString" portion of the XML, which is not true for spreadsheets with no string data, which causes a run-time message. Likewise, when conditional number format is not one of the built-in formats, a run-time message is issued because 'isset' is used to determine existence rather than 'array_key_exists'. The new workbook added to the testing data demonstrates both those problems (prior to the code changes). * Move Comment to Resolve Conflict Github reports conflict involving placement of one comment statement. * Respond to Scrutinizer Style Suggestion Change detection for empty SimpleXMLElement.	2020-01-04 00:10:41 +01:00
Mahmoud Abdo	785705b712	Best effort to support invalid colspan values in HTML reader Closes #878	2019-07-27 23:31:23 -07:00
Mark Baker	d8047b071b	Basic unit test and fix for loading data validations from xlsx file (#1063)	2019-07-08 19:55:14 +02:00
Mark Baker	0e6238c69e	CVE-2019-12331 (#1041) * Detect doubly-encoded xml to hide XXE attacks Correct use of LibXml_Disable_Entity_Loader * New test for double-encoded xml in security scanner	2019-07-01 00:55:25 +02:00
Mark Baker	1e711541f1	Refactoring xlsx reader (#1033) Start work on breaking up monolithic Reader and Writer classes into dedicated subclasses to make maintenance work easier	2019-06-30 23:42:25 +02:00
Mark Baker	6c25b6f422	Refactor Xlsx Properties Reader code into a separate class (#1001) * Unit tests for refactoring Spreadsheet properties * Refactor Xlsx Properties Reader code into a separate class	2019-06-10 16:44:55 +02:00
kraser	906bdc613c	Fix failure when parsing xlsx with drawing having double (redefined) … (#945) * Fix failure when parsing xlsx with drawing having double (redefined) attributes * Fix failure when parsing xlsx with drawing having double (redefined) attributes	2019-05-30 11:42:00 +02:00
AlexPravdin	ebc0b56959	Fix #853 when loading and saving XLSX file with empty drawing cause c… (#882) * Fix #853 when loading and saving XLSX file with empty drawing cause corrupted output file. Store empty drawing as unparsed entity and save it as is when saving the file. * Fix code style	2019-05-30 10:38:03 +02:00
Mark Baker	9b004b1e6a	Ignore escaped enclosures within an enclosure when inferring csv separator (#906)	2019-02-25 23:20:50 +01:00
Patrick Brouwers	1c99f4999c	[Feature] Html reader improvements (#884) * Extract character set, so we can convert to UTF-8 if required * Set column width and row height when defined on tr/td * Parse align and valign on td * Specify number format of cell via html attribute * Formatting of b, strong, i and em tags * Inserting image in cell when using img tag in html * Add applying inline styles: border, fonts, alignment, dimensions * Add tests for applying inline styles	2019-02-16 23:11:16 +01:00
MarkBaker	41bcf9a21c	Support for additional callback in XML Security Scanner	2018-11-25 14:00:35 +01:00
MarkBaker	7a06d71e1c	Add UTF-7 XXE Unit test data	2018-11-19 23:22:59 +01:00
Laurent	79d86ef5cc	Csv reader avoid notice when the file is empty Fixes #337	2018-10-28 14:16:53 +11:00
Paul Barton	813855b2b2	Fix CSV delimiter detection on line breaks The CSV Reader can now correctly ignore line breaks inside enclosures which allows it to determine the delimiter correctly. Fixes #716 Fixes #717	2018-10-21 18:23:55 +11:00
bayzhanov	08b4456641	Xls file threw exception during open by Xls reader Ignore some exception in property, if stream is empty Fixes #402 Fixes #659	2018-10-07 18:49:01 +11:00
Adrien Crivelli	9fdcaabe3c	Could not open CSV file containing HTML fragment We now always trust the file extension to avoid false positive of mime detection for most simple cases. But we still try to guess the mime type if the file extension does not match or is missing. Fixes #564	2018-06-25 11:12:27 +09:00
Robin D'Arcy	c723833d6f	Allow CSV escape character to be set Fixes #492 Closes #510	2018-05-23 10:31:41 +09:00
Adrien Crivelli	e31878ceb1	Check for MIME type to know if CSV reader can read a file CSV reader used to accept any file without any kind of check. That made users incorrectly believe that things were ok, even though there is no way for CSV reader to read anything else than plain text files. Fixes #167	2018-02-05 21:33:23 +09:00
Adrien Crivelli	481fc4a7c6	Support XML file without styles Closes #331 Closes https://github.com/PHPOffice/PHPExcel/pull/559 Fixes https://github.com/PHPOffice/PHPExcel/issues/558	2018-01-14 17:08:50 +09:00
Adrien Crivelli	139d85d874	Better auto-detection of CSV separators Closes #305	2017-12-28 12:25:37 +09:00
GreatHumorist	2abe56b946	Support missing attribute `r` in `c` node when reading xlsx When describing a cell, the cell reference (r="A1") is optional. When not present, we should just increment the index of the last processed row. Fixes #201 Closes #225	2017-09-22 14:49:38 +09:00
GreatHumorist	0477e6fcfe	In Xml reader throw exception in case of invalid XML (#222) When the xml file is not a standard xml file, the `simplexml_load_string` will return false, this will cause an error on "$xml->getNamespaces(true);" . So instead of showing the error, we throw an exception.	2017-09-20 14:20:12 +09:00
Markus Lanthaler	3ee9cc5ce6	Infer CSV delimiter if it hasn't been set explicitly Closes #141	2017-04-20 17:02:03 +09:00
Paolo Agostinetto	c954eddf57	Ods reader: fix sheet count and added a test for sheet names	2017-02-20 21:02:04 +01:00
Paolo Agostinetto	1dba2d1766	Ods reader: tests for repeated spaces and rich text	2017-02-18 20:49:48 +01:00
Paolo Agostinetto	bcd1bc364c	Ods reader: test loading of Worksheets	2017-02-18 13:55:22 +01:00
Paolo Agostinetto	3c7b2e23da	Added unit tests for Ods reader	2017-02-18 13:36:08 +01:00
Adrien Crivelli	e6bbc4bd25	Convert all line endings to unix style	2016-11-27 15:45:15 +09:00
Alexander Kurilo	408da0c17a	Make HTML checks more strict	2016-11-16 22:21:30 +09:00
Alexander Kurilo	edb3974a0d	Move XXE test data to add data for other readers	2016-11-16 22:21:30 +09:00
Adrien Crivelli	e1f81f0fe0	Refactor tests data from custom format to PHP FIX #14	2016-08-16 21:00:19 +09:00

41 Commits