Wrap lines to 80

We wrapped lines to make the source easier to read and to keep diffs more manageable.

This was done with something like:

```sh
pandoc --wrap=auto --atx-headers -f markdown -t markdown-fenced_code_attributes+pipe_tables+raw_html+intraword_underscores
```
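
A minimal sketch of how that command could be applied to every Markdown file, not the exact script that was used. The `docs` directory name, the `DOCS_DIR` and `PANDOC` variables, the `wrap_file` helper and the temporary-file step are all assumptions made for illustration. Note also that newer pandoc releases deprecate `--atx-headers` in favour of `--markdown-headings=atx`, so the flag may need adjusting.

```sh
#!/bin/sh
# Sketch only: re-wrap every Markdown file under a directory with pandoc.
# DOCS_DIR, PANDOC and wrap_file are hypothetical names, not from the commit.
DOCS_DIR="${DOCS_DIR:-docs}"
PANDOC="${PANDOC:-pandoc}"

wrap_file() {
    # Write to a temporary file first, so a pandoc failure never
    # truncates the original source file.
    tmp="$1.tmp"
    if "$PANDOC" --wrap=auto --atx-headers -f markdown \
        -t markdown-fenced_code_attributes+pipe_tables+raw_html+intraword_underscores \
        "$1" -o "$tmp"; then
        mv "$tmp" "$1"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Only walk the tree if the (assumed) documentation directory exists.
if [ -d "$DOCS_DIR" ]; then
    find "$DOCS_DIR" -type f -name '*.md' | while IFS= read -r file; do
        wrap_file "$file"
    done
fi
```

Reviewing the result with `git diff --stat` before committing is a cheap way to confirm that only wrapping and escaping changed.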
Adrien Crivelli 2016-12-04 00:00:54 +09:00
parent 17d1976526
commit a06731fcc6
GPG Key ID: B182FD79DC6DE92E
9 changed files with 2627 additions and 1266 deletions

@ -1,20 +1,27 @@
# Frequently asked questions # Frequently asked questions
The up-to-date F.A.Q. page for PHPExcel can be found on [http://www.codeplex.com/PHPExcel/Wiki/View.aspx?title=FAQ&referringTitle=Requirements](http://www.codeplex.com/PHPExcel/Wiki/View.aspx?title=FAQ&referringTitle=Requirements). The up-to-date F.A.Q. page for PHPExcel can be found on
<http://www.codeplex.com/PHPExcel/Wiki/View.aspx?title=FAQ&referringTitle=Requirements>.
## There seems to be a problem with character encoding... ## There seems to be a problem with character encoding...
It is necessary to use UTF-8 encoding for all texts in PhpSpreadsheet. If the script uses different encoding then you can convert those texts with PHP's iconv() or mb_convert_encoding() functions. It is necessary to use UTF-8 encoding for all texts in PhpSpreadsheet.
If the script uses different encoding then you can convert those texts
with PHP's iconv() or mb\_convert\_encoding() functions.
## PHP complains about ZipArchive not being found ## PHP complains about ZipArchive not being found
Make sure you meet all requirements, especially php_zip extension should be enabled. Make sure you meet all requirements, especially php\_zip extension
should be enabled.
The ZipArchive class is only required when reading or writing formats that use Zip compression (Xlsx and Ods). Since version 1.7.6 the PCLZip library has been bundled with PhpSpreadsheet as an alternative to the ZipArchive class. The ZipArchive class is only required when reading or writing formats
that use Zip compression (Xlsx and Ods). Since version 1.7.6 the PCLZip
library has been bundled with PhpSpreadsheet as an alternative to the
ZipArchive class.
This can be enabled by calling: This can be enabled by calling:
```php ``` php
\PhpOffice\PhpSpreadsheet\Settings::setZipClass(\PhpOffice\PhpSpreadsheet\Settings::PCLZIP); \PhpOffice\PhpSpreadsheet\Settings::setZipClass(\PhpOffice\PhpSpreadsheet\Settings::PCLZIP);
``` ```
@ -22,73 +29,120 @@ This can be enabled by calling:
You can revert to using ZipArchive by calling: You can revert to using ZipArchive by calling:
```php ``` php
\PhpOffice\PhpSpreadsheet\Settings::setZipClass(\PhpOffice\PhpSpreadsheet\Settings::ZIPARCHIVE); \PhpOffice\PhpSpreadsheet\Settings::setZipClass(\PhpOffice\PhpSpreadsheet\Settings::ZIPARCHIVE);
``` ```
At present, this only allows you to write Xlsx files without the need for ZipArchive (not to read Xlsx or Ods) At present, this only allows you to write Xlsx files without the need
for ZipArchive (not to read Xlsx or Ods)
## Excel 2007 cannot open the file generated on Windows ## Excel 2007 cannot open the file generated on Windows
"Excel found unreadable content in '*.xlsx'. Do you want to recover the contents of this workbook? If you trust the source of this workbook, click Yes." "Excel found unreadable content in '\*.xlsx'. Do you want to recover the
contents of this workbook? If you trust the source of this workbook,
click Yes."
Some older versions of the 5.2.x php_zip extension on Windows contain an error when creating ZIP files. The version that can be found on [http://snaps.php.net/win32/php5.2-win32-latest.zip](http://snaps.php.net/win32/php5.2-win32-latest.zip) should work at all times. Some older versions of the 5.2.x php\_zip extension on Windows contain
an error when creating ZIP files. The version that can be found on
<http://snaps.php.net/win32/php5.2-win32-latest.zip> should work at all
times.
Alternatively, upgrading to at least PHP 5.2.9 should solve the problem. Alternatively, upgrading to at least PHP 5.2.9 should solve the problem.
If you can't locate a clean copy of ZipArchive, then you can use the PCLZip library as an alternative when writing Xlsx files, as described above. If you can't locate a clean copy of ZipArchive, then you can use the
PCLZip library as an alternative when writing Xlsx files, as described
above.
## Fatal error: Allowed memory size of xxx bytes exhausted (tried to allocate yyy bytes) in zzz on line aaa ## Fatal error: Allowed memory size of xxx bytes exhausted (tried to allocate yyy bytes) in zzz on line aaa
PhpSpreadsheet holds an "in memory" representation of a spreadsheet, so it is susceptible to PHP's memory limitations. The memory made available to PHP can be increased by editing the value of the memory_limit directive in your php.ini file, or by using ini_set('memory_limit', '128M') within your code (ISP permitting). PhpSpreadsheet holds an "in memory" representation of a spreadsheet, so
it is susceptible to PHP's memory limitations. The memory made available
to PHP can be increased by editing the value of the memory\_limit
directive in your php.ini file, or by using ini\_set('memory\_limit',
'128M') within your code (ISP permitting).
Some Readers and Writers are faster than others, and they also use differing amounts of memory. You can find some indication of the relative performance and memory usage for the different Readers and Writers, over the different versions of PhpSpreadsheet, on the [discussion board](http://phpexcel.codeplex.com/Thread/View.aspx?ThreadId=234150). Some Readers and Writers are faster than others, and they also use
differing amounts of memory. You can find some indication of the
relative performance and memory usage for the different Readers and
Writers, over the different versions of PhpSpreadsheet, on the
[discussion
board](http://phpexcel.codeplex.com/Thread/View.aspx?ThreadId=234150).
If you've already increased memory to a maximum, or can't change your memory limit, then [this discussion](http://phpexcel.codeplex.com/Thread/View.aspx?ThreadId=242712) on the board describes some of the methods that can be applied to reduce the memory usage of your scripts using PhpSpreadsheet. If you've already increased memory to a maximum, or can't change your
memory limit, then [this
discussion](http://phpexcel.codeplex.com/Thread/View.aspx?ThreadId=242712)
on the board describes some of the methods that can be applied to reduce
the memory usage of your scripts using PhpSpreadsheet.
## Protection on my worksheet is not working? ## Protection on my worksheet is not working?
When you make use of any of the worksheet protection features (e.g. cell range protection, prohibiting deleting rows, ...), make sure you enable worksheet security. This can for example be done like this: When you make use of any of the worksheet protection features (e.g. cell
range protection, prohibiting deleting rows, ...), make sure you enable
worksheet security. This can for example be done like this:
```php ``` php
$spreadsheet->getActiveSheet()->getProtection()->setSheet(true); $spreadsheet->getActiveSheet()->getProtection()->setSheet(true);
``` ```
## Feature X is not working with Reader_Y / Writer_Z ## Feature X is not working with Reader\_Y / Writer\_Z
Not all features of PhpSpreadsheet are implemented in all of the Reader / Writer classes. This is mostly due to underlying libraries not supporting a specific feature or not having implemented a specific feature. Not all features of PhpSpreadsheet are implemented in all of the Reader
/ Writer classes. This is mostly due to underlying libraries not
supporting a specific feature or not having implemented a specific
feature.
For example autofilter is not implemented in PEAR Spreadsheet_Excel_writer, which is the base of our Xls writer. For example autofilter is not implemented in PEAR
Spreadsheet\_Excel\_writer, which is the base of our Xls writer.
We are slowly building up a list of features, together with the different readers and writers that support them, in the "Functionality Cross-Reference.xls" file in the /Documentation folder. We are slowly building up a list of features, together with the
different readers and writers that support them, in the "Functionality
Cross-Reference.xls" file in the /Documentation folder.
## Formulas don't seem to be calculated in Excel2003 using compatibility pack? ## Formulas don't seem to be calculated in Excel2003 using compatibility pack?
This is normal behaviour of the compatibility pack, Xlsx displays this correctly. Use \PhpOffice\PhpSpreadsheet\Writer\Xls if you really need calculated values, or force recalculation in Excel2003. This is normal behaviour of the compatibility pack, Xlsx displays this
correctly. Use \PhpOffice\PhpSpreadsheet\Writer\Xls if you really need
calculated values, or force recalculation in Excel2003.
## Setting column width is not 100% accurate ## Setting column width is not 100% accurate
Trying to set column width, I experience one problem. When I open the file in Excel, the actual width is 0.71 less than it should be. Trying to set column width, I experience one problem. When I open the
file in Excel, the actual width is 0.71 less than it should be.
The short answer is that PhpSpreadsheet uses a measure where padding is included. See section: "Setting a column's width" for more details. The short answer is that PhpSpreadsheet uses a measure where padding is
included. See section: "Setting a column's width" for more details.
## How do I use PhpSpreadsheet with my framework ## How do I use PhpSpreadsheet with my framework
- There are some instructions for using PhpSpreadsheet with Joomla on the [Joomla message board](http://http:/forum.joomla.org/viewtopic.php?f=304&t=433060) - There are some instructions for using PhpSpreadsheet with Joomla on
- A page of advice on using [PhpSpreadsheet in the Yii framework](http://www.yiiframework.com/wiki/101/how-to-use-phpexcel-external-library-with-yii/) the [Joomla message
- [The Bakery](http://bakery.cakephp.org/articles/melgior/2010/01/26/simple-excel-spreadsheet-helper) has some helper classes for reading and writing with PhpSpreadsheet within CakePHP board](http://http:/forum.joomla.org/viewtopic.php?f=304&t=433060)
- Integrating [PhpSpreadsheet into Kohana 3](http://www.flynsarmy.com/2010/07/phpexcel-module-for-kohana-3/) and [Интеграция PHPExcel и Kohana Framework][http://szpargalki.blogspot.com/2011/02/phpexcel-kohana-framework.html] - A page of advice on using [PhpSpreadsheet in the Yii
- Using [PhpSpreadsheet with TYPO3](http://typo3.org/documentation/document-library/extension-manuals/phpexcel_library/1.1.1/view/toc/0/) framework](http://www.yiiframework.com/wiki/101/how-to-use-phpexcel-external-library-with-yii/)
- [The
Bakery](http://bakery.cakephp.org/articles/melgior/2010/01/26/simple-excel-spreadsheet-helper)
has some helper classes for reading and writing with PhpSpreadsheet
within CakePHP
- Integrating [PhpSpreadsheet into Kohana
3](http://www.flynsarmy.com/2010/07/phpexcel-module-for-kohana-3/)
and \[Интеграция PHPExcel и Kohana
Framework\]\[http://szpargalki.blogspot.com/2011/02/phpexcel-kohana-framework.html\]
- Using [PhpSpreadsheet with
TYPO3](http://typo3.org/documentation/document-library/extension-manuals/phpexcel_library/1.1.1/view/toc/0/)
## Joomla Autoloader interferes with PhpSpreadsheet Autoloader ## Joomla Autoloader interferes with PhpSpreadsheet Autoloader
Thanks to peterrlynch for the following advice on resolving issues between the [PhpSpreadsheet autoloader and Joomla Autoloader](http://phpexcel.codeplex.com/discussions/211925) Thanks to peterrlynch for the following advice on resolving issues
between the [PhpSpreadsheet autoloader and Joomla
Autoloader](http://phpexcel.codeplex.com/discussions/211925)
### Tutorials ### Tutorials
- [English PHPExcel tutorial](http://openxmldeveloper.org) - [English PHPExcel tutorial](http://openxmldeveloper.org)
- [French PHPExcel tutorial](http://g-ernaelsten.developpez.com/tutoriels/excel2007/) - [French PHPExcel
- [Russian PHPExcel Blog Postings](http://www.web-junior.net/sozdanie-excel-fajjlov-s-pomoshhyu-phpexcel/) tutorial](http://g-ernaelsten.developpez.com/tutoriels/excel2007/)
- [A Japanese-language introduction to PHPExcel](http://journal.mycom.co.jp/articles/2009/03/06/phpexcel/index.html) - [Russian PHPExcel Blog
Postings](http://www.web-junior.net/sozdanie-excel-fajjlov-s-pomoshhyu-phpexcel/)
- [A Japanese-language introduction to
PHPExcel](http://journal.mycom.co.jp/articles/2009/03/06/phpexcel/index.html)

File diff suppressed because it is too large

@ -1,37 +1,62 @@
# AutoFilter Reference # AutoFilter Reference
## Introduction ## Introduction
Each worksheet in an Excel Workbook can contain a single autoFilter range. Filtered data displays only the rows that meet criteria that you specify and hides rows that you do not want displayed. You can filter by more than one column: filters are additive, which means that each additional filter is based on the current filter and further reduces the subset of data. Each worksheet in an Excel Workbook can contain a single autoFilter
range. Filtered data displays only the rows that meet criteria that you
specify and hides rows that you do not want displayed. You can filter by
more than one column: filters are additive, which means that each
additional filter is based on the current filter and further reduces the
subset of data.
![01-01-autofilter.png](./images/01-01-autofilter.png "") ![01-01-autofilter.png](./images/01-01-autofilter.png)
When an AutoFilter is applied to a range of cells, the first row in an autofilter range will be the heading row, which displays the autoFilter dropdown icons. It is not part of the actual autoFiltered data. All subsequent rows are the autoFiltered data. So an AutoFilter range should always contain the heading row and one or more data rows (one data row is pretty meaningless), but PhpSpreadsheet won't actually stop you specifying a meaningless range: it's up to you as a developer to avoid such errors. When an AutoFilter is applied to a range of cells, the first row in an
autofilter range will be the heading row, which displays the autoFilter
dropdown icons. It is not part of the actual autoFiltered data. All
subsequent rows are the autoFiltered data. So an AutoFilter range should
always contain the heading row and one or more data rows (one data row
is pretty meaningless), but PhpSpreadsheet won't actually stop you
specifying a meaningless range: it's up to you as a developer to avoid
such errors.
To determine if a filter is applied, note the icon in the column heading. A drop-down arrow (![01-03-filter-icon-1.png](./images/01-03-filter-icon-1.png "")) means that filtering is enabled but not applied. In MS Excel, when you hover over the heading of a column with filtering enabled but not applied, a screen tip displays the cell text for the first row in that column, and the message "(Showing All)". To determine if a filter is applied, note the icon in the column
heading. A drop-down arrow
(![01-03-filter-icon-1.png](./images/01-03-filter-icon-1.png)) means
that filtering is enabled but not applied. In MS Excel, when you hover
over the heading of a column with filtering enabled but not applied, a
screen tip displays the cell text for the first row in that column, and
the message "(Showing All)".
![01-02-autofilter.png](./images/01-02-autofilter.png "") ![01-02-autofilter.png](./images/01-02-autofilter.png)
A Filter button
(![01-03-filter-icon-2.png](./images/01-03-filter-icon-2.png)) means
that a filter is applied. When you hover over the heading of a filtered
column, a screen tip displays the filter that has been applied to that
column, such as "Equals a red cell color" or "Larger than 150".
A Filter button (![01-03-filter-icon-2.png](./images/01-03-filter-icon-2.png "")) means that a filter is applied. When you hover over the heading of a filtered column, a screen tip displays the filter that has been applied to that column, such as "Equals a red cell color" or "Larger than 150". ![01-04-autofilter.png](./images/01-04-autofilter.png)
![01-04-autofilter.png](./images/01-04-autofilter.png "")
## Setting an AutoFilter area on a worksheet ## Setting an AutoFilter area on a worksheet
To set an autoFilter on a range of cells. To set an autoFilter on a range of cells.
```php ``` php
$spreadsheet->getActiveSheet()->setAutoFilter('A1:E20'); $spreadsheet->getActiveSheet()->setAutoFilter('A1:E20');
``` ```
The first row in an autofilter range will be the heading row, which displays the autoFilter dropdown icons. It is not part of the actual autoFiltered data. All subsequent rows are the autoFiltered data. So an AutoFilter range should always contain the heading row and one or more data rows (one data row is pretty meaningless, but PhpSpreadsheet won't actually stop you specifying a meaningless range: it's up to you as a developer to avoid such errors. The first row in an autofilter range will be the heading row, which
displays the autoFilter dropdown icons. It is not part of the actual
autoFiltered data. All subsequent rows are the autoFiltered data. So an
AutoFilter range should always contain the heading row and one or more
data rows (one data row is pretty meaningless, but PhpSpreadsheet won't
actually stop you specifying a meaningless range: it's up to you as a
developer to avoid such errors.
If you want to set the whole worksheet as an autofilter region If you want to set the whole worksheet as an autofilter region
```php ``` php
$spreadsheet->getActiveSheet()->setAutoFilter( $spreadsheet->getActiveSheet()->setAutoFilter(
$spreadsheet->getActiveSheet() $spreadsheet->getActiveSheet()
->calculateWorksheetDimension() ->calculateWorksheetDimension()
@ -40,52 +65,69 @@ $spreadsheet->getActiveSheet()->setAutoFilter(
This enables filtering, but does not actually apply any filters. This enables filtering, but does not actually apply any filters.
## Autofilter Expressions ## Autofilter Expressions
PHPEXcel 1.7.8 introduced the ability to actually create, read and write filter expressions; initially only for Xlsx files, but later releases will extend this to other formats. PHPEXcel 1.7.8 introduced the ability to actually create, read and write
filter expressions; initially only for Xlsx files, but later releases
will extend this to other formats.
To apply a filter expression to an autoFilter range, you first need to identify which column you're going to be applying this filter to. To apply a filter expression to an autoFilter range, you first need to
identify which column you're going to be applying this filter to.
```php ``` php
$autoFilter = $spreadsheet->getActiveSheet()->getAutoFilter(); $autoFilter = $spreadsheet->getActiveSheet()->getAutoFilter();
$columnFilter = $autoFilter->getColumn('C'); $columnFilter = $autoFilter->getColumn('C');
``` ```
This returns an autoFilter column object, and you can then apply filter expressions to that column. This returns an autoFilter column object, and you can then apply filter
expressions to that column.
There are a number of different types of autofilter expressions. The most commonly used are: There are a number of different types of autofilter expressions. The
most commonly used are:
- Simple Filters - Simple Filters
- DateGroup Filters - DateGroup Filters
- Custom filters - Custom filters
- Dynamic Filters - Dynamic Filters
- Top Ten Filters - Top Ten Filters
These different types are mutually exclusive within any single column. You should not mix the different types of filter in the same column. PhpSpreadsheet will not actively prevent you from doing this, but the results are unpredictable. These different types are mutually exclusive within any single column.
You should not mix the different types of filter in the same column.
Other filter expression types (such as cell colour filters) are not yet supported. PhpSpreadsheet will not actively prevent you from doing this, but the
results are unpredictable.
Other filter expression types (such as cell colour filters) are not yet
supported.
### Simple filters ### Simple filters
In MS Excel, Simple Filters are a dropdown list of all values used in that column, and the user can select which ones they want to display and which ones they want to hide by ticking and unticking the checkboxes alongside each option. When the filter is applied, rows containing the checked entries will be displayed, rows that don't contain those values will be hidden. In MS Excel, Simple Filters are a dropdown list of all values used in
that column, and the user can select which ones they want to display and
which ones they want to hide by ticking and unticking the checkboxes
alongside each option. When the filter is applied, rows containing the
checked entries will be displayed, rows that don't contain those values
will be hidden.
![04-01-simple-autofilter.png](./images/04-01-simple-autofilter.png "") ![04-01-simple-autofilter.png](./images/04-01-simple-autofilter.png)
To create a filter expression, we need to start by identifying the filter type. In this case, we're just going to specify that this filter is a standard filter. To create a filter expression, we need to start by identifying the
filter type. In this case, we're just going to specify that this filter
is a standard filter.
```php ``` php
$columnFilter->setFilterType( $columnFilter->setFilterType(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_FILTERTYPE_FILTER \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_FILTERTYPE_FILTER
); );
``` ```
Now we've identified the filter type, we can create a filter rule and set the filter values: Now we've identified the filter type, we can create a filter rule and
set the filter values:
When creating a simple filter in PhpSpreadsheet, you only need to specify the values for "checked" columns: you do this by creating a filter rule for each value. When creating a simple filter in PhpSpreadsheet, you only need to
specify the values for "checked" columns: you do this by creating a
filter rule for each value.
```php ``` php
$columnFilter->createRule() $columnFilter->createRule()
->setRule( ->setRule(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_EQUAL, \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_EQUAL,
@ -99,15 +141,18 @@ $columnFilter->createRule()
); );
``` ```
This creates two filter rules: the column will be filtered by values that match “France” OR “Germany”. For Simple Filters, you can create as many rules as you want This creates two filter rules: the column will be filtered by values
that match “France” OR “Germany”. For Simple Filters, you can create as
many rules as you want
Simple filters are always a comparison match of EQUALS, and multiple standard filters are always treated as being joined by an OR condition. Simple filters are always a comparison match of EQUALS, and multiple
standard filters are always treated as being joined by an OR condition.
#### Matching Blanks #### Matching Blanks
If you want to create a filter to select blank cells, you would use: If you want to create a filter to select blank cells, you would use:
```php ``` php
$columnFilter->createRule() $columnFilter->createRule()
->setRule( ->setRule(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_EQUAL, \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_EQUAL,
@ -117,21 +162,26 @@ $columnFilter->createRule()
### DateGroup Filters ### DateGroup Filters
In MS Excel, DateGroup filters provide a series of dropdown filter selectors for date values, so you can specify entire years, or months within a year, or individual days within each month. In MS Excel, DateGroup filters provide a series of dropdown filter
selectors for date values, so you can specify entire years, or months
within a year, or individual days within each month.
![04-02-dategroup-autofilter.png](./images/04-02-dategroup-autofilter.png "") ![04-02-dategroup-autofilter.png](./images/04-02-dategroup-autofilter.png)
DateGroup filters are still applied as a Standard Filter type. DateGroup filters are still applied as a Standard Filter type.
```php ``` php
$columnFilter->setFilterType( $columnFilter->setFilterType(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_FILTERTYPE_FILTER \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_FILTERTYPE_FILTER
); );
``` ```
Creating a dateGroup filter in PhpSpreadsheet, you specify the values for "checked" columns as an associative array of year. month, day, hour minute and second. To select a year and month, you need to create a DateGroup rule identifying the selected year and month: Creating a dateGroup filter in PhpSpreadsheet, you specify the values
for "checked" columns as an associative array of year. month, day, hour
minute and second. To select a year and month, you need to create a
DateGroup rule identifying the selected year and month:
```php ``` php
$columnFilter->createRule() $columnFilter->createRule()
->setRule( ->setRule(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_EQUAL, \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_EQUAL,
@ -147,31 +197,39 @@ $columnFilter->createRule()
The key values for the associative array are: The key values for the associative array are:
- year - year
- month - month
- day - day
- hour - hour
- minute - minute
- second - second
Like Standard filters, DateGroup filters are always a match of EQUALS, and multiple standard filters are always treated as being joined by an OR condition. Like Standard filters, DateGroup filters are always a match of EQUALS,
and multiple standard filters are always treated as being joined by an
Note that we alse specify a ruleType: to differentiate this from a standard filter, we explicitly set the Rule's Type to AUTOFILTER_RULETYPE_DATEGROUP. As with standard filters, we can create any number of DateGroup Filters. OR condition.
Note that we alse specify a ruleType: to differentiate this from a
standard filter, we explicitly set the Rule's Type to
AUTOFILTER\_RULETYPE\_DATEGROUP. As with standard filters, we can create
any number of DateGroup Filters.
### Custom filters ### Custom filters
In MS Excel, Custom filters allow us to select more complex conditions using an operator as well as a value. Typical examples might be values that fall within a range (e.g. between -20 and +20), or text values with wildcards (e.g. beginning with the letter U). To handle this, they In MS Excel, Custom filters allow us to select more complex conditions
using an operator as well as a value. Typical examples might be values
that fall within a range (e.g. between -20 and +20), or text values with
wildcards (e.g. beginning with the letter U). To handle this, they
![04-03-custom-autofilter-1.png](./images/04-03-custom-autofilter-1.png "") ![04-03-custom-autofilter-1.png](./images/04-03-custom-autofilter-1.png)
![04-03-custom-autofilter-2.png](./images/04-03-custom-autofilter-2.png "") ![04-03-custom-autofilter-2.png](./images/04-03-custom-autofilter-2.png)
Custom filters are limited to 2 rules, and these can be joined using either an AND or an OR. Custom filters are limited to 2 rules, and these can be joined using
either an AND or an OR.
We start by specifying a Filter type, this time a CUSTOMFILTER. We start by specifying a Filter type, this time a CUSTOMFILTER.
```php ``` php
$columnFilter->setFilterType( $columnFilter->setFilterType(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_FILTERTYPE_CUSTOMFILTER \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_FILTERTYPE_CUSTOMFILTER
); );
@ -179,9 +237,10 @@ $columnFilter->setFilterType(
And then define our rules. And then define our rules.
The following shows a simple wildcard filter to show all column entries beginning with the letter 'U'. The following shows a simple wildcard filter to show all column entries
beginning with the letter 'U'.
```php ``` php
$columnFilter->createRule() $columnFilter->createRule()
->setRule( ->setRule(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_EQUAL, \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_EQUAL,
@ -192,13 +251,20 @@ $columnFilter->createRule()
); );
``` ```
MS Excel uses \* as a wildcard to match any number of characters, and ? as a wildcard to match a single character. 'U\*' equates to "begins with a 'U'"; '\*U' equates to "ends with a 'U'"; and '\*U\*' equates to "contains a 'U'" MS Excel uses \* as a wildcard to match any number of characters, and ?
as a wildcard to match a single character. 'U\*' equates to "begins with
a 'U'"; '\*U' equates to "ends with a 'U'"; and '\*U\*' equates to
"contains a 'U'"
If you want to match explicitly against a \* or a ? character, you can escape it with a tilde (~), so ?~\*\* would explicitly match for a \* character as the second character in the cell value, followed by any number of other characters. The only other character that needs escaping is the ~ itself. If you want to match explicitly against a \* or a ? character, you can
escape it with a tilde (\~), so ?\~\*\* would explicitly match for a \*
character as the second character in the cell value, followed by any
number of other characters. The only other character that needs escaping
is the \~ itself.
To create a "between" condition, we need to define two rules: To create a "between" condition, we need to define two rules:
```php ``` php
$columnFilter->createRule() $columnFilter->createRule()
->setRule( ->setRule(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_GREATERTHANOREQUAL, \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_GREATERTHANOREQUAL,
@ -219,15 +285,19 @@ $columnFilter->createRule()
We also set the rule type to CUSTOMFILTER. We also set the rule type to CUSTOMFILTER.
This defined two rules, filtering numbers that are >= -20 OR <= 20, so we also need to modify the join condition to reflect AND rather than OR. This defined two rules, filtering numbers that are &gt;= -20 OR &lt;=
20, so we also need to modify the join condition to reflect AND rather
than OR.
```php ``` php
$columnFilter->setAndOr( $columnFilter->setAndOr(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_COLUMN_ANDOR_AND \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_COLUMN_ANDOR_AND
); );
``` ```
The valid set of operators for Custom Filters are defined in the \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule class, and comprise: The valid set of operators for Custom Filters are defined in the
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule class, and
comprise:
Operator Constant | Value Operator Constant | Value
------------------------------------------|---------------------- ------------------------------------------|----------------------
@ -238,24 +308,29 @@ AUTOFILTER_COLUMN_RULE_GREATERTHANOREQUAL | 'greaterThanOrEqual'
AUTOFILTER_COLUMN_RULE_LESSTHAN | 'lessThan' AUTOFILTER_COLUMN_RULE_LESSTHAN | 'lessThan'
AUTOFILTER_COLUMN_RULE_LESSTHANOREQUAL | 'lessThanOrEqual' AUTOFILTER_COLUMN_RULE_LESSTHANOREQUAL | 'lessThanOrEqual'
### Dynamic Filters ### Dynamic Filters
Dynamic Filters are based on a dynamic comparison condition, where the value we're comparing against the cell values is variable, such as 'today'; or when we're testing against an aggregate of the cell data (e.g. 'aboveAverage'). Only a single dynamic filter can be applied to a column at a time. Dynamic Filters are based on a dynamic comparison condition, where the
value we're comparing against the cell values is variable, such as
'today'; or when we're testing against an aggregate of the cell data
(e.g. 'aboveAverage'). Only a single dynamic filter can be applied to a
column at a time.
![04-04-dynamic-autofilter.png](./images/04-04-dynamic-autofilter.png "") ![04-04-dynamic-autofilter.png](./images/04-04-dynamic-autofilter.png)
Again, we start by specifying a Filter type, this time a DYNAMICFILTER. Again, we start by specifying a Filter type, this time a DYNAMICFILTER.
```php ``` php
$columnFilter->setFilterType( $columnFilter->setFilterType(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_FILTERTYPE_DYNAMICFILTER \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_FILTERTYPE_DYNAMICFILTER
); );
``` ```
When defining the rule for a dynamic filter, we don't define a value (we can simply set that to NULL) but we do specify the dynamic filter category. When defining the rule for a dynamic filter, we don't define a value (we
can simply set that to NULL) but we do specify the dynamic filter
category.
```php ``` php
$columnFilter->createRule() $columnFilter->createRule()
->setRule( ->setRule(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_EQUAL, \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_EQUAL,
@ -269,7 +344,9 @@ $columnFilter->createRule()
We also set the rule type to DYNAMICFILTER. We also set the rule type to DYNAMICFILTER.
The valid set of dynamic filter categories is defined in the \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule class, and comprises: The valid set of dynamic filter categories is defined in the
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule class, and
comprises:
Operator Constant | Value Operator Constant | Value
-----------------------------------------|---------------- -----------------------------------------|----------------
@ -322,22 +399,28 @@ AUTOFILTER_RULETYPE_DYNAMIC_BELOWAVERAGE | 'belowAverage'
We can only apply a single Dynamic Filter rule to a column at a time. We can only apply a single Dynamic Filter rule to a column at a time.
### Top Ten Filters ### Top Ten Filters
Top Ten Filters are similar to Dynamic Filters in that they are based on a summarisation of the actual data values in the cells. However, unlike Dynamic Filters where you can only select a single option, Top Ten Filters allow you to select based on a number of criteria: Top Ten Filters are similar to Dynamic Filters in that they are based on
a summarisation of the actual data values in the cells. However, unlike
Dynamic Filters where you can only select a single option, Top Ten
Filters allow you to select based on a number of criteria:
![04-05-custom-topten-1.png](./images/04-05-topten-autofilter-1.png "") ![04-05-custom-topten-1.png](./images/04-05-topten-autofilter-1.png)
![04-05-custom-topten-2.png](./images/04-05-topten-autofilter-2.png "") ![04-05-custom-topten-2.png](./images/04-05-topten-autofilter-2.png)
You can identify whether you want the top (highest) or bottom (lowest) values. You can identify how many values you wish to select in the filter. You can identify whether this should be a percentage or a number of items. You can identify whether you want the top (highest) or bottom (lowest)
values. You can identify how many values you wish to select in the
filter. You can identify whether this should be a percentage or a number
of items.
Like Dynamic Filters, only a single Top Ten filter can be applied to a column at a time. Like Dynamic Filters, only a single Top Ten filter can be applied to a
column at a time.
We start by specifying a Filter type, this time a TOPTENFILTER. We start by specifying a Filter type, this time a TOPTENFILTER.
```php ``` php
$columnFilter->setFilterType( $columnFilter->setFilterType(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_FILTERTYPE_TOPTENFILTER \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column::AUTOFILTER_FILTERTYPE_TOPTENFILTER
); );
@ -345,7 +428,7 @@ $columnFilter->setFilterType(
Then we create the rule: Then we create the rule:
```php ``` php
$columnFilter->createRule() $columnFilter->createRule()
->setRule( ->setRule(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_TOPTEN_PERCENT, \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_TOPTEN_PERCENT,
@ -361,7 +444,7 @@ This will filter the Top 5 percent of values in the column.
To specify the lowest (bottom 2 values), we would specify a rule of: To specify the lowest (bottom 2 values), we would specify a rule of:
```php ``` php
$columnFilter->createRule() $columnFilter->createRule()
->setRule( ->setRule(
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_TOPTEN_BY_VALUE, \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule::AUTOFILTER_COLUMN_RULE_TOPTEN_BY_VALUE,
@ -373,7 +456,10 @@ $columnFilter->createRule()
); );
``` ```
The option values for TopTen Filters top/bottom value/percent are all defined in the \PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule class, and comprise: The option values for TopTen Filters top/bottom value/percent are all
defined in the
\PhpOffice\PhpSpreadsheet\Worksheet\AutoFilter\Column\Rule class, and
comprise:
Operator Constant | Value Operator Constant | Value
---------------------------------------|------------- ---------------------------------------|-------------
@ -389,26 +475,37 @@ AUTOFILTER_COLUMN_RULE_TOPTEN_BOTTOM | 'bottom'
## Executing an AutoFilter ## Executing an AutoFilter
When an autofilter is applied in MS Excel, it sets the row hidden/visible flags for each row of the autofilter area based on the selected criteria, so that only those rows that match the filter criteria are displayed. When an autofilter is applied in MS Excel, it sets the row
hidden/visible flags for each row of the autofilter area based on the
selected criteria, so that only those rows that match the filter
criteria are displayed.
PhpSpreadsheet will not execute the equivalent function automatically when you set or change a filter expression, but only when the file is saved. PhpSpreadsheet will not execute the equivalent function automatically
when you set or change a filter expression, but only when the file is
saved.
### Applying the Filter ### Applying the Filter
If you wish to execute your filter from within a script, you need to do this manually. You can do this using the autofilter's showHideRows() method. If you wish to execute your filter from within a script, you need to do
this manually. You can do this using the autofilter's showHideRows()
method.
```php ``` php
$autoFilter = $spreadsheet->getActiveSheet()->getAutoFilter(); $autoFilter = $spreadsheet->getActiveSheet()->getAutoFilter();
$autoFilter->showHideRows(); $autoFilter->showHideRows();
``` ```
This will set all rows that match the filter criteria to visible, while hiding all other rows within the autofilter area. This will set all rows that match the filter criteria to visible, while
hiding all other rows within the autofilter area.
### Displaying Filtered Rows ### Displaying Filtered Rows
Simply looping through the rows in an autofilter area will still access every row, whether it matches the filter criteria or not. To selectively access only the filtered rows, you need to test each row's visibility settings. Simply looping through the rows in an autofilter area will still access
every row, whether it matches the filter criteria or not. To selectively
access only the filtered rows, you need to test each row's visibility
settings.
```php ``` php
foreach ($spreadsheet->getActiveSheet()->getRowIterator() as $row) { foreach ($spreadsheet->getActiveSheet()->getRowIterator() as $row) {
if ($spreadsheet->getActiveSheet() if ($spreadsheet->getActiveSheet()
->getRowDimension($row->getRowIndex())->getVisible()) { ->getRowDimension($row->getRowIndex())->getVisible()) {
@ -429,5 +526,5 @@ foreach ($spreadsheet->getActiveSheet()->getRowIterator() as $row) {
## AutoFilter Sorting ## AutoFilter Sorting
In MS Excel, Autofiltering also allows the rows to be sorted. This feature is ***not*** supported by PhpSpreadsheet. In MS Excel, Autofiltering also allows the rows to be sorted. This
feature is ***not*** supported by PhpSpreadsheet.

File diff suppressed because it is too large


@ -1,49 +1,126 @@
# File Formats # File Formats
PhpSpreadsheet can read a number of different spreadsheet and file formats, although not all features are supported by all of the readers. Check the [features cross reference](../references/features-cross-reference.md) for a list that identifies which features are supported by which readers. PhpSpreadsheet can read a number of different spreadsheet and file
formats, although not all features are supported by all of the readers.
Check the [features cross
reference](../references/features-cross-reference.md) for a list that
identifies which features are supported by which readers.
Currently, PhpSpreadsheet supports the following File Types for Reading: Currently, PhpSpreadsheet supports the following File Types for Reading:
### Xls ### Xls
The Microsoft Excel™ Binary file format (BIFF5 and BIFF8) is a binary file format that was used by Microsoft Excel™ between versions 95 and 2003. The format is supported (to various extents) by most spreadsheet programs. BIFF files normally have an extension of .xls. Documentation describing the format can be found online at [http://msdn.microsoft.com/en-us/library/cc313154(v=office.12).aspx](http://msdn.microsoft.com/en-us/library/cc313154(v=office.12).aspx) or [as a downloadable PDF](http://download.microsoft.com/download/2/4/8/24862317-78F0-4C4B-B355-C7B2C1D997DB/[MS-XLS].pdf). The Microsoft Excel™ Binary file format (BIFF5 and BIFF8) is a binary
file format that was used by Microsoft Excel™ between versions 95 and
2003. The format is supported (to various extents) by most spreadsheet
programs. BIFF files normally have an extension of .xls. Documentation
describing the format can be found online at
<http://msdn.microsoft.com/en-us/library/cc313154(v=office.12).aspx> or
[as a downloadable
PDF](http://download.microsoft.com/download/2/4/8/24862317-78F0-4C4B-B355-C7B2C1D997DB/%5BMS-XLS%5D.pdf).
### Excel2003XML ### Excel2003XML
Microsoft Excel™ 2003 included options for a file format called SpreadsheetML. This file is a zipped XML document. It is not very common, but its core features are supported. Documentation for the format can be found at [http://msdn.microsoft.com/en-us/library/aa140066%28office.10%29.aspx](http://msdn.microsoft.com/en-us/library/aa140066%28office.10%29.aspx) though it's sadly rather sparse in its detail. Microsoft Excel™ 2003 included options for a file format called
SpreadsheetML. This file is a zipped XML document. It is not very
common, but its core features are supported. Documentation for the
format can be found at
<http://msdn.microsoft.com/en-us/library/aa140066%28office.10%29.aspx>
though it's sadly rather sparse in its detail.
### Xlsx ### Xlsx
Microsoft Excel™ 2007 shipped with a new file format, namely Microsoft Office Open XML SpreadsheetML, and Excel 2010 extended this still further with its new features such as sparklines. These files typically have an extension of .xlsx. This format is based around a zipped collection of eXtensible Markup Language (XML) files. Microsoft Office Open XML SpreadsheetML is mostly standardized in ECMA 376 ([http://www.ecma-international.org/news/TC45_current_work/TC45_available_docs.htm](http://www.ecma-international.org/news/TC45_current_work/TC45_available_docs.htm)) and ISO 29500. Microsoft Excel™ 2007 shipped with a new file format, namely Microsoft
Office Open XML SpreadsheetML, and Excel 2010 extended this still
further with its new features such as sparklines. These files typically
have an extension of .xlsx. This format is based around a zipped
collection of eXtensible Markup Language (XML) files. Microsoft Office
Open XML SpreadsheetML is mostly standardized in ECMA 376
(<http://www.ecma-international.org/news/TC45_current_work/TC45_available_docs.htm>)
and ISO 29500.
### Ods ### Ods
aka Open Document Format (ODF) or OASIS, this is the OpenOffice.org XML File Format for spreadsheets. It comprises a zip archive including several components all of which are text files, most of these with markup in the eXtensible Markup Language (XML). It is the standard file format for OpenOffice.org Calc and StarCalc, and files typically have an extension of .ods. The published specification for the file format is available from the OASIS Open Office XML Format Technical Committee web page ([http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office#technical](http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office#technical)). Other information is available from the OpenOffice.org XML File Format web page ([http://xml.openoffice.org/general.html](http://xml.openoffice.org/general.html)), part of the OpenOffice.org project. aka Open Document Format (ODF) or OASIS, this is the OpenOffice.org XML
File Format for spreadsheets. It comprises a zip archive including
several components all of which are text files, most of these with
markup in the eXtensible Markup Language (XML). It is the standard file
format for OpenOffice.org Calc and StarCalc, and files typically have an
extension of .ods. The published specification for the file format is
available from the OASIS Open Office XML Format Technical Committee web
page
(<http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office#technical>).
Other information is available from the OpenOffice.org XML File Format
web page (<http://xml.openoffice.org/general.html>), part of the
OpenOffice.org project.
### SYLK ### SYLK
This is the Microsoft Multiplan Symbolic Link Interchange (SYLK) file format. Multiplan was a predecessor to Microsoft Excel™. Files normally have an extension of .slk. While not common, there are still a few applications that generate SYLK files as a cross-platform option, because (despite being limited to a single worksheet) it is a simple format to implement, and supports some basic data and cell formatting options (unlike CSV files). This is the Microsoft Multiplan Symbolic Link Interchange (SYLK) file
format. Multiplan was a predecessor to Microsoft Excel™. Files normally
have an extension of .slk. While not common, there are still a few
applications that generate SYLK files as a cross-platform option,
because (despite being limited to a single worksheet) it is a simple
format to implement, and supports some basic data and cell formatting
options (unlike CSV files).
### Gnumeric ### Gnumeric
The Gnumeric file format is used by the Gnome Gnumeric spreadsheet application, and typically files have an extension of .gnumeric. The file contents are stored using eXtensible Markup Language (XML) markup, and the file is then compressed using the GNU project's gzip compression library. [http://projects.gnome.org/gnumeric/doc/file-format-gnumeric.shtml](http://projects.gnome.org/gnumeric/doc/file-format-gnumeric.shtml) The Gnumeric file format is used by the Gnome Gnumeric spreadsheet
application, and typically files have an extension of .gnumeric. The
file contents are stored using eXtensible Markup Language (XML) markup,
and the file is then compressed using the GNU project's gzip compression
library.
<http://projects.gnome.org/gnumeric/doc/file-format-gnumeric.shtml>
### CSV ### CSV
Comma Separated Value (CSV) file format is a common structuring strategy for text format files. In CSV files, each line in the file represents a row of data and (within each line of the file) the different data fields (or columns) are separated from one another using a comma (","). If a data field contains a comma, then it should be enclosed (typically in quotation marks, "). Sometimes tabs "\t", or the pipe symbol ("|"), or a semi-colon (";") are used as separators instead of a comma, although other symbols can be used. Because CSV is a text-only format, it doesn't support any data formatting options. Comma Separated Value (CSV) file format is a common structuring strategy
for text format files. In CSV files, each line in the file represents a
row of data and (within each line of the file) the different data fields
(or columns) are separated from one another using a comma (","). If a
data field contains a comma, then it should be enclosed (typically in
quotation marks, "). Sometimes tabs "\t", or the pipe symbol ("|"), or a
semi-colon (";") are used as separators instead of a comma, although
other symbols can be used. Because CSV is a text-only format, it doesn't
support any data formatting options.
"CSV" is not a single, well-defined format (although see RFC 4180 for one definition that is commonly used). Rather, in practice the term "CSV" refers to any file that: "CSV" is not a single, well-defined format (although see RFC 4180 for
one definition that is commonly used). Rather, in practice the term
"CSV" refers to any file that:
- is plain text using a character set such as ASCII, Unicode, EBCDIC, or Shift JIS, - is plain text using a character set such as ASCII, Unicode, EBCDIC,
- consists of records (typically one record per line), or Shift JIS,
- with the records divided into fields separated by delimiters (typically a single reserved character such as comma, semicolon, or tab), - consists of records (typically one record per line),
- where every record has the same sequence of fields. - with the records divided into fields separated by delimiters
(typically a single reserved character such as comma, semicolon, or
tab),
- where every record has the same sequence of fields.
Within these general constraints, many variations are in use. Therefore "CSV" files are not entirely portable. Nevertheless, the variations are fairly small, and many implementations allow users to glance at the file (which is feasible because it is plain text), and then specify the delimiter character(s), quoting rules, etc. Within these general constraints, many variations are in use. Therefore
"CSV" files are not entirely portable. Nevertheless, the variations are
fairly small, and many implementations allow users to glance at the file
(which is feasible because it is plain text), and then specify the
delimiter character(s), quoting rules, etc.
**Warning:** Microsoft Excel™ will open .csv files, but depending on the system's regional settings, it may expect a semicolon as a separator instead of a comma, since in some languages the comma is used as the decimal separator. Also, many regional versions of Excel will not be able to deal with Unicode characters in a CSV file. **Warning:** Microsoft Excel™ will open .csv files, but depending on the
system's regional settings, it may expect a semicolon as a separator
instead of a comma, since in some languages the comma is used as the
decimal separator. Also, many regional versions of Excel will not be
able to deal with Unicode characters in a CSV file.
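Because the delimiter varies between files, it can help to inspect the first line before parsing. The following is a minimal plain-PHP sketch, not PhpSpreadsheet's own detection logic: the `guessDelimiter()` helper and the sample line are hypothetical, and the heuristic simply picks the candidate that yields the most fields when the line is parsed with PHP's built-in `str_getcsv()`.

```php
<?php
// Hypothetical helper: tries each candidate delimiter on one line and
// keeps the one that produces the most fields. Quoted fields are
// honoured because the counting is done by str_getcsv() itself.
function guessDelimiter(string $line, array $candidates = [',', ';', "\t", '|']): string
{
    $best = $candidates[0];
    $bestCount = 0;

    foreach ($candidates as $candidate) {
        $count = count(str_getcsv($line, $candidate, '"', '\\'));
        if ($count > $bestCount) {
            $best = $candidate;
            $bestCount = $count;
        }
    }

    return $best;
}

// A semicolon-separated record whose second field contains the delimiter
// and is therefore enclosed in quotation marks.
$line = 'name;"Smith; John";42';

$delimiter = guessDelimiter($line);
$fields = str_getcsv($line, $delimiter, '"', '\\');
```

A single line is a weak sample, so a real importer would probably check several lines and confirm that they all give the same field count.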
### HTML ### HTML
HyperText Markup Language (HTML) is the main markup language for creating web pages and other information that can be displayed in a web browser. Files typically have an extension of .html or .htm. HTML markup provides a means to create structured documents by denoting structural semantics for text such as headings, paragraphs, lists, links, quotes and other items. Since 1996, the HTML specifications have been maintained, with input from commercial software vendors, by the World Wide Web Consortium (W3C). However, in 2000, HTML also became an international standard (ISO/IEC 15445:2000). HTML 4.01 was published in late 1999, with further errata published through 2001. In 2004 development began on HTML5 in the Web Hypertext Application Technology Working Group (WHATWG), which became a joint deliverable with the W3C in 2008. HyperText Markup Language (HTML) is the main markup language for
creating web pages and other information that can be displayed in a web
browser. Files typically have an extension of .html or .htm. HTML markup
provides a means to create structured documents by denoting structural
semantics for text such as headings, paragraphs, lists, links, quotes
and other items. Since 1996, the HTML specifications have been
maintained, with input from commercial software vendors, by the World
Wide Web Consortium (W3C). However, in 2000, HTML also became an
international standard (ISO/IEC 15445:2000). HTML 4.01 was published in
late 1999, with further errata published through 2001. In 2004
development began on HTML5 in the Web Hypertext Application Technology
Working Group (WHATWG), which became a joint deliverable with the W3C in
2008.


@ -1,17 +1,18 @@
# Migration from PHPExcel # Migration from PHPExcel
PhpSpreadsheet introduced many breaking changes by introducing namespaces and PhpSpreadsheet introduced many breaking changes by introducing
renaming some classes. To help you migrate an existing project, a tool was written namespaces and renaming some classes. To help you migrate an existing
to replace all references to PHPExcel classes with their new names. project, a tool was written to replace all references to PHPExcel
classes with their new names.
The tool is included in PhpSpreadsheet. It scans recursively all files and The tool is included in PhpSpreadsheet. It scans recursively all files
directories, starting from the current directory. Assuming it was installed with and directories, starting from the current directory. Assuming it was
composer, it can be run like so: installed with composer, it can be run like so:
```sh ``` sh
cd /project/to/migrate/src cd /project/to/migrate/src
/project/to/migrate/vendor/phpoffice/phpspreadsheet/bin/migrate-from-phpexcel /project/to/migrate/vendor/phpoffice/phpspreadsheet/bin/migrate-from-phpexcel
``` ```
**Important** The tool will irreversibly modify your sources. Be sure to back up **Important** The tool will irreversibly modify your sources. Be sure to
everything, and double-check the result before committing. back up everything, and double-check the result before committing.


@ -1,41 +1,64 @@
# Reading Files # Reading Files
## Security ## Security
XML-based formats such as OfficeOpen XML, Excel2003 XML, OASIS and Gnumeric are susceptible to XML External Entity Processing (XXE) injection attacks (for an explanation of XXE injection see http://websec.io/2012/08/27/Preventing-XEE-in-PHP.html) when reading spreadsheet files. This can lead to: XML-based formats such as OfficeOpen XML, Excel2003 XML, OASIS and
Gnumeric are susceptible to XML External Entity Processing (XXE)
injection attacks (for an explanation of XXE injection see
http://websec.io/2012/08/27/Preventing-XEE-in-PHP.html) when reading
spreadsheet files. This can lead to:
- Disclosure of whether a file exists - Disclosure of whether a file exists
- Server Side Request Forgery - Server Side Request Forgery
- Command Execution (depending on the installed PHP wrappers) - Command Execution (depending on the installed PHP wrappers)
To prevent this, PhpSpreadsheet sets `libxml_disable_entity_loader` to
To prevent this, PhpSpreadsheet sets `libxml_disable_entity_loader` to `true` for the XML-based Readers by default. `true` for the XML-based Readers by default.
## Loading a Spreadsheet File ## Loading a Spreadsheet File
The simplest way to load a workbook file is to let PhpSpreadsheet's IO Factory identify the file type and load it, calling the static load() method of the \PhpOffice\PhpSpreadsheet\IOFactory class. The simplest way to load a workbook file is to let PhpSpreadsheet's IO
Factory identify the file type and load it, calling the static load()
method of the \PhpOffice\PhpSpreadsheet\IOFactory class.
```php ``` php
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load($inputFileName); $spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load($inputFileName);
``` ```
> See Examples/Reader/exampleReader01.php for a working example of this code.
The load() method will attempt to identify the file type, and instantiate a loader for that file type; using it to load the file and store the data and any formatting in a `Spreadsheet` object. > See Examples/Reader/exampleReader01.php for a working example of this
> code.
The method makes an initial guess at the loader to instantiate based on the file extension; but will test the file before actually executing the load: so if (for example) the file is actually a CSV file or contains HTML markup, but has been given a .xls extension (quite a common practice), it will reject the Xls loader that it would normally use for a .xls file; and test the file using the other loaders until it finds the appropriate loader, and then use that to read the file. The load() method will attempt to identify the file type, and
instantiate a loader for that file type; using it to load the file and
store the data and any formatting in a `Spreadsheet` object.
While easy to implement in your code, and you don't need to worry about the file type; this isn't the most efficient method to load a file; and it lacks the flexibility to configure the loader in any way before actually reading the file into a `Spreadsheet` object. The method makes an initial guess at the loader to instantiate based on
the file extension; but will test the file before actually executing the
load: so if (for example) the file is actually a CSV file or contains
HTML markup, but has been given a .xls extension (quite a common
practice), it will reject the Xls loader that it would normally use for
a .xls file; and test the file using the other loaders until it finds
the appropriate loader, and then use that to read the file.
While easy to implement in your code, and you don't need to worry about
the file type; this isn't the most efficient method to load a file; and
it lacks the flexibility to configure the loader in any way before
actually reading the file into a `Spreadsheet` object.
## Creating a Reader and Loading a Spreadsheet File ## Creating a Reader and Loading a Spreadsheet File
If you know the file type of the spreadsheet file that you need to load, you can instantiate a new reader object for that file type, then use the reader's load() method to read the file to a `Spreadsheet` object. It is possible to instantiate the reader objects for each of the different supported filetype by name. However, you may get unpredictable results if the file isn't of the right type (e.g. it is a CSV with an extension of .xls), although this type of exception should normally be trapped. If you know the file type of the spreadsheet file that you need to load,
you can instantiate a new reader object for that file type, then use the
reader's load() method to read the file to a `Spreadsheet` object. It is
possible to instantiate the reader objects for each of the different
supported filetype by name. However, you may get unpredictable results
if the file isn't of the right type (e.g. it is a CSV with an extension
of .xls), although this type of exception should normally be trapped.
```php ``` php
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
/** Create a new Xls Reader **/ /** Create a new Xls Reader **/
@ -49,11 +72,15 @@ $reader = new \PhpOffice\PhpSpreadsheet\Reader\Xls();
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader02.php for a working example of this code.
Alternatively, you can use the IO Factory's createReader() method to instantiate the reader object for you, simply telling it the file type of the reader that you want instantiating. > See Examples/Reader/exampleReader02.php for a working example of this
> code.
```php Alternatively, you can use the IO Factory's createReader() method to
instantiate the reader object for you, simply telling it the file type
of the reader that you want instantiating.
``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
// $inputFileType = 'Xlsx'; // $inputFileType = 'Xlsx';
// $inputFileType = 'Excel2003XML'; // $inputFileType = 'Excel2003XML';
@ -68,11 +95,15 @@ $reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader03.php for a working example of this code.
If you're uncertain of the filetype, you can use the IO Factory's identify() method to identify the reader that you need, before using the createReader() method to instantiate the reader object. > See Examples/Reader/exampleReader03.php for a working example of this
> code.
```php If you're uncertain of the filetype, you can use the IO Factory's
identify() method to identify the reader that you need, before using the
createReader() method to instantiate the reader object.
``` php
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
/** Identify the type of $inputFileName **/ /** Identify the type of $inputFileName **/
@ -82,17 +113,24 @@ $reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader04.php for a working example of this code.
> See Examples/Reader/exampleReader04.php for a working example of this
> code.
## Spreadsheet Reader Options ## Spreadsheet Reader Options
Once you have created a reader object for the workbook that you want to load, you have the opportunity to set additional options before executing the load() method. Once you have created a reader object for the workbook that you want to
load, you have the opportunity to set additional options before
executing the load() method.
### Reading Only Data from a Spreadsheet File ### Reading Only Data from a Spreadsheet File
If you're only interested in the cell values in a workbook, but don't need any of the cell formatting information, then you can set the reader to read only the data values and any formulae from each cell using the setReadDataOnly() method. If you're only interested in the cell values in a workbook, but don't
need any of the cell formatting information, then you can set the reader
to read only the data values and any formulae from each cell using the
setReadDataOnly() method.
```php ``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
@ -103,11 +141,21 @@ $reader->setReadDataOnly(true);
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader05.php for a working example of this code.
It is important to note that Workbooks (and PhpSpreadsheet) store dates and times as simple numeric values: they can only be distinguished from other numeric values by the format mask that is applied to that cell. When setting read data only to true, PhpSpreadsheet doesn't read the cell format masks, so it is not possible to differentiate between dates/times and numbers. > See Examples/Reader/exampleReader05.php for a working example of this
> code.
The Gnumeric loader has been written to read the format masks for date values even when read data only has been set to true, so it can differentiate between dates/times and numbers; but this change hasn't yet been implemented for the other readers. It is important to note that Workbooks (and PhpSpreadsheet) store dates
and times as simple numeric values: they can only be distinguished from
other numeric values by the format mask that is applied to that cell.
When setting read data only to true, PhpSpreadsheet doesn't read the
cell format masks, so it is not possible to differentiate between
dates/times and numbers.
The Gnumeric loader has been written to read the format masks for date
values even when read data only has been set to true, so it can
differentiate between dates/times and numbers; but this change hasn't
yet been implemented for the other readers.
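
To make the point about dates concrete, here is a minimal sketch in plain PHP of how a numeric value read with setReadDataOnly(true) could be turned back into a date by hand. It assumes the workbook uses the 1900 date system and treats the value as UTC; the function name `excelSerialToDateTime()` is made up for this illustration and is not part of PhpSpreadsheet.

```php
<?php
/**
 * Convert an Excel serial date (1900 date system) to a DateTimeImmutable.
 * Hypothetical helper for illustration; not a PhpSpreadsheet function.
 */
function excelSerialToDateTime(float $serial): DateTimeImmutable
{
    // 25569 is the serial number of 1970-01-01 in the 1900 date system,
    // and one day is 86400 seconds.
    $unixSeconds = (int) round(($serial - 25569) * 86400);

    return (new DateTimeImmutable('@' . $unixSeconds))
        ->setTimezone(new DateTimeZone('UTC'));
}

// 39919.5625 is how 16-Apr-2009 13:30 is stored in a cell.
echo excelSerialToDateTime(39919.5625)->format('Y-m-d H:i'), PHP_EOL;
// prints "2009-04-16 13:30"
```

Without the format mask, nothing in the value itself says that 39919.5625 is a date rather than an ordinary number, so a conversion like this has to be applied by the caller to the cells that are known to hold dates.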
Reading Only Data from a Spreadsheet File applies to Readers: Reading Only Data from a Spreadsheet File applies to Readers:
@ -119,11 +167,15 @@ CSV | NO | HTML | NO
### Reading Only Named WorkSheets from a File ### Reading Only Named WorkSheets from a File
If your workbook contains a number of worksheets, but you are only interested in reading some of those, then you can use the setLoadSheetsOnly() method to identify those sheets you are interested in reading. If your workbook contains a number of worksheets, but you are only
interested in reading some of those, then you can use the
setLoadSheetsOnly() method to identify those sheets you are interested
in reading.
To read a single sheet, you can pass that sheet name as a parameter to the setLoadSheetsOnly() method. To read a single sheet, you can pass that sheet name as a parameter to
the setLoadSheetsOnly() method.
```php ``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #2'; $sheetname = 'Data Sheet #2';
@ -135,11 +187,14 @@ $reader->setLoadSheetsOnly($sheetname);
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader07.php for a working example of this code.
If you want to read more than just a single sheet, you can pass a list of sheet names as an array parameter to the setLoadSheetsOnly() method. > See Examples/Reader/exampleReader07.php for a working example of this
> code.
```php If you want to read more than just a single sheet, you can pass a list
of sheet names as an array parameter to the setLoadSheetsOnly() method.
``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
$sheetnames = array('Data Sheet #1','Data Sheet #3'); $sheetnames = array('Data Sheet #1','Data Sheet #3');
@ -151,11 +206,14 @@ $reader->setLoadSheetsOnly($sheetnames);
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader08.php for a working example of this code.
To reset this option to the default, you can call the setLoadAllSheets() method. > See Examples/Reader/exampleReader08.php for a working example of this
> code.
```php To reset this option to the default, you can call the setLoadAllSheets()
method.
``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
@ -166,7 +224,9 @@ $reader->setLoadAllSheets();
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader06.php for a working example of this code.
> See Examples/Reader/exampleReader06.php for a working example of this
> code.
Reading Only Named WorkSheets from a File applies to Readers: Reading Only Named WorkSheets from a File applies to Readers:
@ -178,9 +238,16 @@ CSV | NO | HTML | NO
### Reading Only Specific Columns and Rows from a File (Read Filters) ### Reading Only Specific Columns and Rows from a File (Read Filters)
If you are only interested in reading part of a worksheet, then you can write a filter class that identifies whether or not individual cells should be read by the loader. A read filter must implement the \PhpOffice\PhpSpreadsheet\Reader\IReadFilter interface, and contain a readCell() method that accepts arguments of $column, $row and $worksheetName, and returns a boolean true or false that indicates whether a workbook cell identified by those arguments should be read or not. If you are only interested in reading part of a worksheet, then you can
write a filter class that identifies whether or not individual cells
should be read by the loader. A read filter must implement the
\PhpOffice\PhpSpreadsheet\Reader\IReadFilter interface, and contain a
readCell() method that accepts arguments of \$column, \$row and
\$worksheetName, and returns a boolean true or false that indicates
whether a workbook cell identified by those arguments should be read or
not.
```php ``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #3'; $sheetname = 'Data Sheet #3';
@ -210,11 +277,16 @@ $reader->setReadFilter($filterSubset);
/** Load only the rows and columns that match our filter to Spreadsheet **/ /** Load only the rows and columns that match our filter to Spreadsheet **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader09.php for a working example of this code.
This example is not particularly useful, because it can only be used in a very specific circumstance (when you only want cells in the range A1:E7 from your worksheet). A generic Read Filter would probably be more useful: > See Examples/Reader/exampleReader09.php for a working example of this
> code.
```php This example is not particularly useful, because it can only be used in
a very specific circumstance (when you only want cells in the range
A1:E7 from your worksheet). A generic Read Filter would probably be more
useful:
``` php
/** Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter */ /** Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter */
class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
{ {
@ -243,11 +315,16 @@ class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
/** Create an Instance of our Read Filter, passing in the cell range **/ /** Create an Instance of our Read Filter, passing in the cell range **/
$filterSubset = new MyReadFilter(9,15,range('G','K')); $filterSubset = new MyReadFilter(9,15,range('G','K'));
``` ```
> See Examples/Reader/exampleReader10.php for a working example of this code.
This can be particularly useful for conserving memory, by allowing you to read and process a large workbook in “chunks”: an example of this usage might be when transferring data from an Excel worksheet to a database. > See Examples/Reader/exampleReader10.php for a working example of this
> code.
```php This can be particularly useful for conserving memory, by allowing you
to read and process a large workbook in “chunks”: an example of this
usage might be when transferring data from an Excel worksheet to a
database.
``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example2.xls'; $inputFileName = './sampleData/example2.xls';
@ -295,21 +372,31 @@ for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
// Do some processing here // Do some processing here
} }
``` ```
> See Examples/Reader/exampleReader12.php for a working example of this code.
> See Examples/Reader/exampleReader12.php for a working example of this
> code.
Using Read Filters applies to: Using Read Filters applies to:
Reader | Y/N |Reader | Y/N |Reader | Y/N | Reader | Y/N |Reader | Y/N |Reader | Y/N |
----------|:---:|--------|:---:|--------------|:---:| ----------|:---:|--------|:---:|--------------|:---:|
Xlsx | YES | Xls | YES | Excel2003XML | YES | Xlsx | YES | Xls | YES | Excel2003XML | YES |
Ods | YES | SYLK | NO | Gnumeric | YES | Ods | YES | SYLK | NO | Gnumeric | YES |
CSV | YES | HTML | NO CSV | YES | HTML | NO | | |
### Combining Multiple Files into a Single Spreadsheet Object ### Combining Multiple Files into a Single Spreadsheet Object
While you can limit the number of worksheets that are read from a workbook file using the setLoadSheetsOnly() method, certain readers also allow you to combine several individual "sheets" from different files into a single `Spreadsheet` object, where each individual file is a single worksheet within that workbook. For each file that you read, you need to indicate which worksheet index it should be loaded into using the setSheetIndex() method of the $reader, then use the loadIntoExisting() method rather than the load() method to actually read the file into that worksheet. While you can limit the number of worksheets that are read from a
workbook file using the setLoadSheetsOnly() method, certain readers also
allow you to combine several individual "sheets" from different files
into a single `Spreadsheet` object, where each individual file is a
single worksheet within that workbook. For each file that you read, you
need to indicate which worksheet index it should be loaded into using
the setSheetIndex() method of the \$reader, then use the
loadIntoExisting() method rather than the load() method to actually read
the file into that worksheet.
```php ``` php
$inputFileType = 'CSV'; $inputFileType = 'CSV';
$inputFileNames = array('./sampleData/example1.csv', $inputFileNames = array('./sampleData/example1.csv',
'./sampleData/example2.csv' './sampleData/example2.csv'
@ -340,23 +427,36 @@ foreach($inputFileNames as $sheet => $inputFileName) {
->setTitle(pathinfo($inputFileName,PATHINFO_BASENAME)); ->setTitle(pathinfo($inputFileName,PATHINFO_BASENAME));
} }
``` ```
> See Examples/Reader/exampleReader13.php for a working example of this code.
Note that using the same sheet index for multiple sheets won't append files into the same sheet, but overwrite the results of the previous load. You cannot load multiple CSV files into the same worksheet. > See Examples/Reader/exampleReader13.php for a working example of this
> code.
Note that using the same sheet index for multiple sheets won't append
files into the same sheet, but overwrite the results of the previous
load. You cannot load multiple CSV files into the same worksheet.
Combining Multiple Files into a Single Spreadsheet Object applies to: Combining Multiple Files into a Single Spreadsheet Object applies to:
Reader | Y/N |Reader | Y/N |Reader | Y/N | Reader | Y/N |Reader | Y/N |Reader | Y/N |
----------|:---:|--------|:---:|--------------|:---:| ----------|:---:|--------|:---:|--------------|:---:|
Xlsx | NO | Xls | NO | Excel2003XML | NO | Xlsx | NO | Xls | NO | Excel2003XML | NO |
Ods | NO | SYLK | YES | Gnumeric | NO | Ods | NO | SYLK | YES | Gnumeric | NO |
CSV | YES | HTML | NO CSV | YES | HTML | NO
### Combining Read Filters with the setSheetIndex() method to split a large CSV file across multiple Worksheets ### Combining Read Filters with the setSheetIndex() method to split a large CSV file across multiple Worksheets
An Xls BIFF .xls file is limited to 65536 rows in a worksheet, while the Xlsx Microsoft Office Open XML SpreadsheetML .xlsx file is limited to 1,048,576 rows in a worksheet; but a CSV file is not limited other than by available disk space. This means that we wouldn't ordinarily be able to read all the rows from a very large CSV file that exceeded those limits, and save it as an Xls or Xlsx file. However, by using Read Filters to read the CSV file in “chunks” (using the chunkReadFilter Class that we defined in section 5.3 above), and the setSheetIndex() method of the $reader, we can split the CSV file across several individual worksheets. An Xls BIFF .xls file is limited to 65536 rows in a worksheet, while the
Xlsx Microsoft Office Open XML SpreadsheetML .xlsx file is limited to
1,048,576 rows in a worksheet; but a CSV file is not limited other than
by available disk space. This means that we wouldn't ordinarily be able
to read all the rows from a very large CSV file that exceeded those
limits, and save it as an Xls or Xlsx file. However, by using Read
Filters to read the CSV file in “chunks” (using the chunkReadFilter
Class that we defined in section 5.3 above),
and the setSheetIndex() method of the \$reader, we can split the CSV
file across several individual worksheets.
```php ``` php
$inputFileType = 'CSV'; $inputFileType = 'CSV';
$inputFileName = './sampleData/example2.csv'; $inputFileName = './sampleData/example2.csv';
@ -397,27 +497,38 @@ for ($startRow = 2; $startRow <= 1000000; $startRow += $chunkSize) {
$spreadsheet->getActiveSheet()->setTitle('Country Data #'.(++$sheet)); $spreadsheet->getActiveSheet()->setTitle('Country Data #'.(++$sheet));
} }
``` ```
> See Examples/Reader/exampleReader14.php for a working example of this code.
This code will read 65,530 rows at a time from the CSV file that we're loading, and store each "chunk" in a new worksheet. > See Examples/Reader/exampleReader14.php for a working example of this
> code.
The setContiguous() method for the Reader is important here. It is applicable only when working with a Read Filter, and identifies whether or not the cells should be stored by their position within the CSV file, or their position relative to the filter. This code will read 65,530 rows at a time from the CSV file that we're
loading, and store each "chunk" in a new worksheet.
For example, if the filter returned true for cells in the range B2:C3, then with setContiguous set to false (the default) these would be loaded as B2:C3 in the `Spreadsheet` object; but with setContiguous set to true, they would be loaded as A1:B2. The setContiguous() method for the Reader is important here. It is
applicable only when working with a Read Filter, and identifies whether
or not the cells should be stored by their position within the CSV file,
or their position relative to the filter.
For example, if the filter returned true for cells in the range B2:C3,
then with setContiguous set to false (the default) these would be loaded
as B2:C3 in the `Spreadsheet` object; but with setContiguous set to
true, they would be loaded as A1:B2.
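
The arithmetic behind the 65,530-row chunks can be looked at on its own. The following sketch, in plain PHP with no PhpSpreadsheet calls, lists which rows of the CSV file would end up on which worksheet for a given chunk size. The function name `chunkRanges()` and the total of 200,000 rows are assumptions chosen for illustration; row 1 is assumed to be a heading row, which is why the first chunk starts at row 2.

```php
<?php
/**
 * List the sheet index, first row and last row of each chunk.
 * Hypothetical helper for illustration only.
 */
function chunkRanges(int $firstDataRow, int $lastRow, int $chunkSize): array
{
    $ranges = [];
    $sheet = 0;
    for ($startRow = $firstDataRow; $startRow <= $lastRow; $startRow += $chunkSize) {
        // The final chunk may be shorter than $chunkSize.
        $endRow = min($startRow + $chunkSize - 1, $lastRow);
        $ranges[] = ['sheet' => $sheet++, 'from' => $startRow, 'to' => $endRow];
    }

    return $ranges;
}

foreach (chunkRanges(2, 200000, 65530) as $range) {
    printf("sheet %d: rows %d-%d\n", $range['sheet'], $range['from'], $range['to']);
}
```

Each range corresponds to one pass of the read filter and one call to setSheetIndex(). With setContiguous set to true, the rows of every chunk would be stored starting from the top of their worksheet rather than at their original row numbers.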
Splitting a single loaded file across multiple worksheets applies to: Splitting a single loaded file across multiple worksheets applies to:
Reader | Y/N |Reader | Y/N |Reader | Y/N | Reader | Y/N |Reader | Y/N |Reader | Y/N |
----------|:---:|--------|:---:|--------------|:---:| ----------|:---:|--------|:---:|--------------|:---:|
Xlsx | NO | Xls | NO | Excel2003XML | NO | Xlsx | NO | Xls | NO | Excel2003XML | NO |
Ods | NO | SYLK | NO | Gnumeric | NO | Ods | NO | SYLK | NO | Gnumeric | NO |
CSV | YES | HTML | NO CSV | YES | HTML | NO
### Pipe or Tab Separated Value Files ### Pipe or Tab Separated Value Files
The CSV loader defaults to loading a file where comma is used as the separator, but you can modify this to load tab- or pipe-separated value files using the setDelimiter() method. The CSV loader defaults to loading a file where comma is used as the
separator, but you can modify this to load tab- or pipe-separated value
files using the setDelimiter() method.
```php ``` php
$inputFileType = 'CSV'; $inputFileType = 'CSV';
$inputFileName = './sampleData/example1.tsv'; $inputFileName = './sampleData/example1.tsv';
@ -429,31 +540,54 @@ $reader->setDelimiter("\t");
/** Load the file to a Spreadsheet Object **/ /** Load the file to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader15.php for a working example of this code.
In addition to the delimiter, you can also use the following methods to set other attributes for the data load: > See Examples/Reader/exampleReader15.php for a working example of this
> code.
setEnclosure() | default is " In addition to the delimiter, you can also use the following methods to
setLineEnding() | default is PHP_EOL set other attributes for the data load:
setEnclosure() | default is " setLineEnding() | default is PHP\_EOL
setInputEncoding() | default is UTF-8 setInputEncoding() | default is UTF-8
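
To see what the delimiter and enclosure settings mean for a single line of input, the sketch below uses PHP's built-in str_getcsv() function rather than the PhpSpreadsheet CSV reader. It only illustrates the concepts and does not show how the reader itself is implemented; the sample lines are invented.

```php
<?php
// A tab-separated line, and a pipe-separated line with one enclosed field.
$tabLine  = "Name\tCountry\tPopulation";
$pipeLine = 'Paris|"Ile-de-France | region"|France';

// Arguments: input string, delimiter, enclosure, escape character.
$tabFields  = str_getcsv($tabLine, "\t", '"', '\\');
$pipeFields = str_getcsv($pipeLine, '|', '"', '\\');

print_r($tabFields);  // three fields: Name, Country, Population
print_r($pipeFields); // the enclosed "|" stays inside the second field
```

The enclosure character is what allows the delimiter itself to appear inside a value, which is why the two settings are usually chosen together.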
Setting CSV delimiter applies to: Setting CSV delimiter applies to:
Reader | Y/N |Reader | Y/N |Reader | Y/N | Reader | Y/N |Reader | Y/N |Reader | Y/N |
----------|:---:|--------|:---:|--------------|:---:| ----------|:---:|--------|:---:|--------------|:---:|
Xlsx | NO | Xls | NO | Excel2003XML | NO | Xlsx | NO | Xls | NO | Excel2003XML | NO |
Ods | NO | SYLK | NO | Gnumeric | NO | Ods | NO | SYLK | NO | Gnumeric | NO |
CSV | YES | HTML | NO CSV | YES | HTML | NO
### A Brief Word about the Advanced Value Binder ### A Brief Word about the Advanced Value Binder
When loading data from a file that contains no formatting information, such as a CSV file, then data is read either as strings or numbers (float or integer). This means that PhpSpreadsheet does not automatically recognise dates/times (such as "16-Apr-2009" or "13:30"), booleans ("TRUE" or "FALSE"), percentages ("75%"), hyperlinks ("http://www.phpexcel.net"), etc as anything other than simple strings. However, you can apply additional processing that is executed against these values during the load process within a Value Binder. When loading data from a file that contains no formatting information,
such as a CSV file, then data is read either as strings or numbers
(float or integer). This means that PhpSpreadsheet does not
automatically recognise dates/times (such as "16-Apr-2009" or "13:30"),
booleans ("TRUE" or "FALSE"), percentages ("75%"), hyperlinks
("http://www.phpexcel.net"), etc as anything other than simple strings.
However, you can apply additional processing that is executed against
these values during the load process within a Value Binder.
A Value Binder is a class that implements the \PhpOffice\PhpSpreadsheet\Cell\IValueBinder interface. It must contain a bindValue() method that accepts a \PhpOffice\PhpSpreadsheet\Cell and a value as arguments, and returns a boolean true or false that indicates whether the workbook cell has been populated with the value or not. The Advanced Value Binder implements such a class: amongst other tests, it identifies a string comprising "TRUE" or "FALSE" (based on locale settings) and sets it to a boolean; or a number in scientific format (e.g. "1.234e-5") and converts it to a float; or dates and times, converting them to their Excel timestamp value before storing the value in the cell object. It also sets formatting for strings that are identified as dates, times or percentages. It could easily be extended to provide additional handling (including text or cell formatting) when it encountered a hyperlink, or HTML markup within a CSV file. A Value Binder is a class that implements the
\PhpOffice\PhpSpreadsheet\Cell\IValueBinder interface. It must contain a
bindValue() method that accepts a \PhpOffice\PhpSpreadsheet\Cell and a
value as arguments, and returns a boolean true or false that indicates
whether the workbook cell has been populated with the value or not. The
Advanced Value Binder implements such a class: amongst other tests, it
identifies a string comprising "TRUE" or "FALSE" (based on locale
settings) and sets it to a boolean; or a number in scientific format
(e.g. "1.234e-5") and converts it to a float; or dates and times,
converting them to their Excel timestamp value before storing the
value in the cell object. It also sets formatting for strings that are
identified as dates, times or percentages. It could easily be extended
to provide additional handling (including text or cell formatting) when
it encountered a hyperlink, or HTML markup within a CSV file.
So using a Value Binder allows a great deal more flexibility in the loader logic when reading unformatted text files. So using a Value Binder allows a great deal more flexibility in the
loader logic when reading unformatted text files.
```php ``` php
/** Tell PhpSpreadsheet that we want to use the Advanced Value Binder **/ /** Tell PhpSpreadsheet that we want to use the Advanced Value Binder **/
\PhpOffice\PhpSpreadsheet\Cell::setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); \PhpOffice\PhpSpreadsheet\Cell::setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() );
@ -464,27 +598,32 @@ $reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
$reader->setDelimiter("\t"); $reader->setDelimiter("\t");
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader15.php for a working example of this code.
> See Examples/Reader/exampleReader15.php for a working example of this
> code.
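
The kind of test a value binder performs can be sketched in plain PHP. The function below is a simplified stand-in written for this illustration: it is not the AdvancedValueBinder code, it ignores locale settings, dates and times, and the name `guessCellValue()` is hypothetical.

```php
<?php
/**
 * Guess a typed value for a raw string read from a CSV file.
 * Simplified illustration; not the AdvancedValueBinder implementation.
 */
function guessCellValue(string $raw)
{
    $text = trim($raw);

    // Booleans (English spellings only in this sketch).
    if (strcasecmp($text, 'TRUE') === 0) {
        return true;
    }
    if (strcasecmp($text, 'FALSE') === 0) {
        return false;
    }

    // Percentages such as "75%" become the fraction 0.75.
    if (preg_match('/^([+-]?\d+(?:\.\d+)?)\s*%$/', $text, $m) === 1) {
        return (float) $m[1] / 100;
    }

    // Plain and scientific-format numbers such as "1.234e-5".
    if (is_numeric($text)) {
        return $text + 0;
    }

    // Anything else stays a string.
    return $raw;
}

var_dump(
    guessCellValue('TRUE'),
    guessCellValue('75%'),
    guessCellValue('1.234e-5'),
    guessCellValue('hello')
);
```

A real binder would also write the result into the cell and, for dates and percentages, set a matching number format; this sketch covers only the detection step.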
Loading using a Value Binder applies to: Loading using a Value Binder applies to:
Reader | Y/N |Reader | Y/N |Reader | Y/N Reader | Y/N |Reader | Y/N |Reader | Y/N
----------|:---:|--------|:---:|--------------|:---: ----------|:---:|--------|:---:|--------------|:---:
Xlsx | NO | Xls | NO | Excel2003XML | NO Xlsx | NO | Xls | NO | Excel2003XML | NO
Ods | NO | SYLK | NO | Gnumeric | NO Ods | NO | SYLK | NO | Gnumeric | NO
CSV | YES | HTML | YES CSV | YES | HTML | YES
## Spreadsheet Reader Options ## Spreadsheet Reader Options
Once you have created a reader object for the workbook that you want to load, you have the opportunity to set additional options before executing the load() method. Once you have created a reader object for the workbook that you want to
load, you have the opportunity to set additional options before
executing the load() method.
### Reading Only Data from a Spreadsheet File ### Reading Only Data from a Spreadsheet File
If you're only interested in the cell values in a workbook, but don't need any of the cell formatting information, then you can set the reader to read only the data values and any formulae from each cell using the setReadDataOnly() method. If you're only interested in the cell values in a workbook, but don't
need any of the cell formatting information, then you can set the reader
to read only the data values and any formulae from each cell using the
setReadDataOnly() method.
```php ``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
@ -495,27 +634,41 @@ $reader->setReadDataOnly(true);
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader05.php for a working example of this code.
It is important to note that Workbooks (and PhpSpreadsheet) store dates and times as simple numeric values: they can only be distinguished from other numeric values by the format mask that is applied to that cell. When setting read data only to true, PhpSpreadsheet doesn't read the cell format masks, so it is not possible to differentiate between dates/times and numbers. > See Examples/Reader/exampleReader05.php for a working example of this
> code.
The Gnumeric loader has been written to read the format masks for date values even when read data only has been set to true, so it can differentiate between dates/times and numbers; but this change hasn't yet been implemented for the other readers. It is important to note that Workbooks (and PhpSpreadsheet) store dates
and times as simple numeric values: they can only be distinguished from
other numeric values by the format mask that is applied to that cell.
When setting read data only to true, PhpSpreadsheet doesn't read the
cell format masks, so it is not possible to differentiate between
dates/times and numbers.
The Gnumeric loader has been written to read the format masks for date
values even when read data only has been set to true, so it can
differentiate between dates/times and numbers; but this change hasn't
yet been implemented for the other readers.
Reading Only Data from a Spreadsheet File applies to Readers: Reading Only Data from a Spreadsheet File applies to Readers:
Reader | Y/N |Reader | Y/N |Reader | Y/N | Reader | Y/N |Reader | Y/N |Reader | Y/N |
----------|:---:|--------|:---:|--------------|:---:| ----------|:---:|--------|:---:|--------------|:---:|
Xlsx | YES | Xls | YES | Excel2003XML | YES | Xlsx | YES | Xls | YES | Excel2003XML | YES |
Ods | YES | SYLK | NO | Gnumeric | YES | Ods | YES | SYLK | NO | Gnumeric | YES |
CSV | NO | HTML | NO CSV | NO | HTML | NO
### Reading Only Named WorkSheets from a File ### Reading Only Named WorkSheets from a File
If your workbook contains a number of worksheets, but you are only interested in reading some of those, then you can use the setLoadSheetsOnly() method to identify those sheets you are interested in reading. If your workbook contains a number of worksheets, but you are only
interested in reading some of those, then you can use the
setLoadSheetsOnly() method to identify those sheets you are interested
in reading.
To read a single sheet, you can pass that sheet name as a parameter to the setLoadSheetsOnly() method. To read a single sheet, you can pass that sheet name as a parameter to
the setLoadSheetsOnly() method.
```php ``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #2'; $sheetname = 'Data Sheet #2';
@ -527,11 +680,14 @@ $reader->setLoadSheetsOnly($sheetname);
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader07.php for a working example of this code.
If you want to read more than just a single sheet, you can pass a list of sheet names as an array parameter to the setLoadSheetsOnly() method. > See Examples/Reader/exampleReader07.php for a working example of this
> code.
```php If you want to read more than just a single sheet, you can pass a list
of sheet names as an array parameter to the setLoadSheetsOnly() method.
``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
$sheetnames = array('Data Sheet #1','Data Sheet #3'); $sheetnames = array('Data Sheet #1','Data Sheet #3');
@ -543,11 +699,14 @@ $reader->setLoadSheetsOnly($sheetnames);
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader08.php for a working example of this code.
To reset this option to the default, you can call the setLoadAllSheets() method. > See Examples/Reader/exampleReader08.php for a working example of this
> code.
```php To reset this option to the default, you can call the setLoadAllSheets()
method.
``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
@ -558,7 +717,9 @@ $reader->setLoadAllSheets();
/** Load $inputFileName to a Spreadsheet Object **/ /** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader06.php for a working example of this code.
> See Examples/Reader/exampleReader06.php for a working example of this
> code.
Reading Only Named WorkSheets from a File applies to Readers: Reading Only Named WorkSheets from a File applies to Readers:
@ -570,9 +731,16 @@ CSV | NO | HTML | NO
### Reading Only Specific Columns and Rows from a File (Read Filters) ### Reading Only Specific Columns and Rows from a File (Read Filters)
If you are only interested in reading part of a worksheet, then you can write a filter class that identifies whether or not individual cells should be read by the loader. A read filter must implement the \PhpOffice\PhpSpreadsheet\Reader\IReadFilter interface, and contain a readCell() method that accepts arguments of $column, $row and $worksheetName, and returns a boolean true or false that indicates whether a workbook cell identified by those arguments should be read or not. If you are only interested in reading part of a worksheet, then you can
write a filter class that identifies whether or not individual cells
should be read by the loader. A read filter must implement the
\PhpOffice\PhpSpreadsheet\Reader\IReadFilter interface, and contain a
readCell() method that accepts arguments of \$column, \$row and
\$worksheetName, and returns a boolean true or false that indicates
whether a workbook cell identified by those arguments should be read or
not.
```php ``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls'; $inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #3'; $sheetname = 'Data Sheet #3';
@ -602,11 +770,16 @@ $reader->setReadFilter($filterSubset);
/** Load only the rows and columns that match our filter to Spreadsheet **/ /** Load only the rows and columns that match our filter to Spreadsheet **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader09.php for a working example of this code.
This example is not particularly useful, because it can only be used in a very specific circumstance (when you only want cells in the range A1:E7 from your worksheet). A generic Read Filter would probably be more useful: > See Examples/Reader/exampleReader09.php for a working example of this
> code.
```php This example is not particularly useful, because it can only be used in
a very specific circumstance (when you only want cells in the range
A1:E7 from your worksheet). A generic Read Filter would probably be more
useful:
``` php
/** Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter */ /** Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter */
class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
{ {
@ -635,11 +808,16 @@ class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
/** Create an Instance of our Read Filter, passing in the cell range **/ /** Create an Instance of our Read Filter, passing in the cell range **/
$filterSubset = new MyReadFilter(9,15,range('G','K')); $filterSubset = new MyReadFilter(9,15,range('G','K'));
``` ```
> See Examples/Reader/exampleReader10.php for a working example of this code.
This can be particularly useful for conserving memory, by allowing you to read and process a large workbook in “chunks”: an example of this usage might be when transferring data from an Excel worksheet to a database. > See Examples/Reader/exampleReader10.php for a working example of this
> code.
```php This can be particularly useful for conserving memory, by allowing you
to read and process a large workbook in “chunks”: an example of this
usage might be when transferring data from an Excel worksheet to a
database.
``` php
$inputFileType = 'Xls'; $inputFileType = 'Xls';
$inputFileName = './sampleData/example2.xls'; $inputFileName = './sampleData/example2.xls';
@ -687,7 +865,9 @@ for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
// Do some processing here // Do some processing here
} }
``` ```
> See Examples/Reader/exampleReader12.php for a working example of this code.
> See Examples/Reader/exampleReader12.php for a working example of this
> code.
Using Read Filters applies to: Using Read Filters applies to:
@ -699,9 +879,17 @@ CSV | YES | HTML | NO
### Combining Multiple Files into a Single Spreadsheet Object ### Combining Multiple Files into a Single Spreadsheet Object
While you can limit the number of worksheets that are read from a workbook file using the setLoadSheetsOnly() method, certain readers also allow you to combine several individual "sheets" from different files into a single `Spreadsheet` object, where each individual file is a single worksheet within that workbook. For each file that you read, you need to indicate which worksheet index it should be loaded into using the setSheetIndex() method of the $reader, then use the loadIntoExisting() method rather than the load() method to actually read the file into that worksheet. While you can limit the number of worksheets that are read from a
workbook file using the setLoadSheetsOnly() method, certain readers also
allow you to combine several individual "sheets" from different files
into a single `Spreadsheet` object, where each individual file is a
single worksheet within that workbook. For each file that you read, you
need to indicate which worksheet index it should be loaded into using
the setSheetIndex() method of the \$reader, then use the
loadIntoExisting() method rather than the load() method to actually read
the file into that worksheet.
```php ``` php
$inputFileType = 'CSV'; $inputFileType = 'CSV';
$inputFileNames = array('./sampleData/example1.csv', $inputFileNames = array('./sampleData/example1.csv',
'./sampleData/example2.csv' './sampleData/example2.csv'
@ -732,9 +920,13 @@ foreach($inputFileNames as $sheet => $inputFileName) {
->setTitle(pathinfo($inputFileName,PATHINFO_BASENAME)); ->setTitle(pathinfo($inputFileName,PATHINFO_BASENAME));
} }
``` ```
> See Examples/Reader/exampleReader13.php for a working example of this code.
Note that using the same sheet index for multiple sheets won't append files into the same sheet, but overwrite the results of the previous load. You cannot load multiple CSV files into the same worksheet. > See Examples/Reader/exampleReader13.php for a working example of this
> code.
Note that using the same sheet index for multiple sheets won't append
files into the same sheet, but overwrite the results of the previous
load. You cannot load multiple CSV files into the same worksheet.
Combining Multiple Files into a Single Spreadsheet Object applies to: Combining Multiple Files into a Single Spreadsheet Object applies to:
@ -744,11 +936,20 @@ Xlsx | NO | Xls | NO | Excel2003XML | NO |
Ods | NO | SYLK | YES | Gnumeric | NO | Ods | NO | SYLK | YES | Gnumeric | NO |
CSV | YES | HTML | NO CSV | YES | HTML | NO
### Combining Read Filters with the setSheetIndex() method to split a large CSV file across multiple Worksheets ### Combining Read Filters with the setSheetIndex() method to split a large CSV file across multiple Worksheets
An Xls BIFF .xls file is limited to 65536 rows in a worksheet, while the Xlsx Microsoft Office Open XML SpreadsheetML .xlsx file is limited to 1,048,576 rows in a worksheet; but a CSV file is not limited other than by available disk space. This means that we wouldn't ordinarily be able to read all the rows from a very large CSV file that exceeded those limits, and save it as an Xls or Xlsx file. However, by using Read Filters to read the CSV file in “chunks” (using the chunkReadFilter Class that we defined in section 5.3 above), and the setSheetIndex() method of the $reader, we can split the CSV file across several individual worksheets. An Xls BIFF .xls file is limited to 65536 rows in a worksheet, while the
Xlsx Microsoft Office Open XML SpreadsheetML .xlsx file is limited to
1,048,576 rows in a worksheet; but a CSV file is not limited other than
by available disk space. This means that we wouldn't ordinarily be able
to read all the rows from a very large CSV file that exceeded those
limits, and save it as an Xls or Xlsx file. However, by using Read
Filters to read the CSV file in “chunks” (using the chunkReadFilter
Class that we defined in section 5.3 above),
and the setSheetIndex() method of the \$reader, we can split the CSV
file across several individual worksheets.
```php ``` php
$inputFileType = 'CSV'; $inputFileType = 'CSV';
$inputFileName = './sampleData/example2.csv'; $inputFileName = './sampleData/example2.csv';
@ -789,13 +990,22 @@ for ($startRow = 2; $startRow <= 1000000; $startRow += $chunkSize) {
$spreadsheet->getActiveSheet()->setTitle('Country Data #'.(++$sheet)); $spreadsheet->getActiveSheet()->setTitle('Country Data #'.(++$sheet));
} }
``` ```
> See Examples/Reader/exampleReader14.php for a working example of this code.
This code will read 65,530 rows at a time from the CSV file that we're loading, and store each "chunk" in a new worksheet. > See Examples/Reader/exampleReader14.php for a working example of this
> code.
The setContiguous() method for the Reader is important here. It is applicable only when working with a Read Filter, and identifies whether or not the cells should be stored by their position within the CSV file, or their position relative to the filter. This code will read 65,530 rows at a time from the CSV file that we're
loading, and store each "chunk" in a new worksheet.
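The chunk arithmetic can be checked in isolation. The following is a minimal sketch in plain PHP, not PhpSpreadsheet code: it assumes the 65,530-row chunk size and the 1,000,000-row upper bound from the loop above, and that row 1 holds headings (an assumption made for illustration).

```php
<?php
// Illustrative only: compute the row range covered by each chunk.
$chunkSize = 65530;
$firstRow  = 2;        // assumption: row 1 holds the headings
$lastRow   = 1000000;  // upper bound used by the loop above

$chunks = [];
for ($startRow = $firstRow; $startRow <= $lastRow; $startRow += $chunkSize) {
    $chunks[] = [
        'sheet'    => count($chunks),              // zero-based worksheet index
        'startRow' => $startRow,
        'endRow'   => $startRow + $chunkSize - 1,  // last row in this chunk
    ];
}

printf("%d worksheets needed\n", count($chunks));
printf("first chunk: rows %d-%d\n", $chunks[0]['startRow'], $chunks[0]['endRow']);
```

Each entry corresponds to one pass of the loop, so the number of entries is the number of worksheets the split would create.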
For example, if the filter returned true for cells in the range B2:C3, then with setContiguous set to false (the default) these would be loaded as B2:C3 in the `Spreadsheet` object; but with setContiguous set to true, they would be loaded as A1:B2. The setContiguous() method for the Reader is important here. It is
applicable only when working with a Read Filter, and identifies whether
or not the cells should be stored by their position within the CSV file,
or their position relative to the filter.
For example, if the filter returned true for cells in the range B2:C3,
then with setContiguous set to false (the default) these would be loaded
as B2:C3 in the `Spreadsheet` object; but with setContiguous set to
true, they would be loaded as A1:B2.
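The coordinate shift in that example can be sketched in plain PHP. `toContiguous()` is a hypothetical helper written for this illustration; it is not part of PhpSpreadsheet and makes no claim about how the reader implements the setting. It handles single-letter columns only.

```php
<?php
// Hypothetical helper: shift a cell coordinate so that the top-left cell
// accepted by the filter lands on A1. Single-letter columns only.
function toContiguous(string $cell, string $filterTopLeft): string
{
    // Split e.g. "B2" into its column letter and row number.
    preg_match('/^([A-Z])(\d+)$/', $cell, $c);
    preg_match('/^([A-Z])(\d+)$/', $filterTopLeft, $o);

    $column = chr(ord('A') + ord($c[1]) - ord($o[1]));
    $row    = (int) $c[2] - (int) $o[2] + 1;

    return $column . $row;
}

foreach (['B2', 'C2', 'B3', 'C3'] as $cell) {
    echo $cell, ' -> ', toContiguous($cell, 'B2'), PHP_EOL;
}
```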
Splitting a single loaded file across multiple worksheets applies to: Splitting a single loaded file across multiple worksheets applies to:
@ -807,9 +1017,11 @@ CSV | YES | HTML | NO
### Pipe or Tab Separated Value Files ### Pipe or Tab Separated Value Files
The CSV loader defaults to loading a file where comma is used as the separator, but you can modify this to load tab- or pipe-separated value files using the setDelimiter() method. The CSV loader defaults to loading a file where comma is used as the
separator, but you can modify this to load tab- or pipe-separated value
files using the setDelimiter() method.
```php ``` php
$inputFileType = 'CSV'; $inputFileType = 'CSV';
$inputFileName = './sampleData/example1.tsv'; $inputFileName = './sampleData/example1.tsv';
@ -821,12 +1033,14 @@ $reader->setDelimiter("\t");
/** Load the file to a Spreadsheet Object **/ /** Load the file to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader15.php for a working example of this code.
In addition to the delimiter, you can also use the following methods to set other attributes for the data load: > See Examples/Reader/exampleReader15.php for a working example of this
> code.
setEnclosure() | default is " In addition to the delimiter, you can also use the following methods to
setLineEnding() | default is PHP_EOL set other attributes for the data load:
setEnclosure() | default is " setLineEnding() | default is PHP\_EOL
setInputEncoding() | default is UTF-8 setInputEncoding() | default is UTF-8
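To see what the delimiter and enclosure settings mean in practice, here is a small sketch using PHP's built-in `str_getcsv()` rather than the PhpSpreadsheet reader. The sample lines are made up for illustration.

```php
<?php
// Uses PHP's built-in str_getcsv(), not the PhpSpreadsheet CSV reader.
$pipeLine = 'Name|"Smith | Jones"|42';
$tabLine  = "Name\tSmith\t42";

// Arguments: input, delimiter, enclosure, escape character.
$pipeFields = str_getcsv($pipeLine, '|', '"', '\\');
$tabFields  = str_getcsv($tabLine, "\t", '"', '\\');

print_r($pipeFields);
print_r($tabFields);
```

The enclosure character lets a field contain the delimiter itself, as in the quoted second field of the pipe-separated line.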
Setting CSV delimiter applies to: Setting CSV delimiter applies to:
@ -839,13 +1053,34 @@ CSV | YES | HTML | NO
### A Brief Word about the Advanced Value Binder ### A Brief Word about the Advanced Value Binder
When loading data from a file that contains no formatting information, such as a CSV file, then data is read either as strings or numbers (float or integer). This means that PhpSpreadsheet does not automatically recognise dates/times (such as "16-Apr-2009" or "13:30"), booleans ("TRUE" or "FALSE"), percentages ("75%"), hyperlinks ("http://www.phpexcel.net"), etc as anything other than simple strings. However, you can apply additional processing that is executed against these values during the load process within a Value Binder. When loading data from a file that contains no formatting information,
such as a CSV file, then data is read either as strings or numbers
(float or integer). This means that PhpSpreadsheet does not
automatically recognise dates/times (such as "16-Apr-2009" or "13:30"),
booleans ("TRUE" or "FALSE"), percentages ("75%"), hyperlinks
("http://www.phpexcel.net"), etc as anything other than simple strings.
However, you can apply additional processing that is executed against
these values during the load process within a Value Binder.
A Value Binder is a class that implements the \PhpOffice\PhpSpreadsheet\Cell\IValueBinder interface. It must contain a bindValue() method that accepts a \PhpOffice\PhpSpreadsheet\Cell and a value as arguments, and returns a boolean true or false that indicates whether the workbook cell has been populated with the value or not. The Advanced Value Binder implements such a class: amongst other tests, it identifies a string comprising "TRUE" or "FALSE" (based on locale settings) and sets it to a boolean; or a number in scientific format (e.g. "1.234e-5") and converts it to a float; or dates and times, converting them to their Excel timestamp value before storing the value in the cell object. It also sets formatting for strings that are identified as dates, times or percentages. It could easily be extended to provide additional handling (including text or cell formatting) when it encountered a hyperlink, or HTML markup within a CSV file. A Value Binder is a class that implements the
\PhpOffice\PhpSpreadsheet\Cell\IValueBinder interface. It must contain a
bindValue() method that accepts a \PhpOffice\PhpSpreadsheet\Cell and a
value as arguments, and returns a boolean true or false that indicates
whether the workbook cell has been populated with the value or not. The
Advanced Value Binder implements such a class: amongst other tests, it
identifies a string comprising "TRUE" or "FALSE" (based on locale
settings) and sets it to a boolean; or a number in scientific format
(e.g. "1.234e-5") and converts it to a float; or dates and times,
converting them to their Excel timestamp value before storing the
value in the cell object. It also sets formatting for strings that are
identified as dates, times or percentages. It could easily be extended
to provide additional handling (including text or cell formatting) when
it encountered a hyperlink, or HTML markup within a CSV file.
So using a Value Binder allows a great deal more flexibility in the loader logic when reading unformatted text files. So using a Value Binder allows a great deal more flexibility in the
loader logic when reading unformatted text files.
```php ``` php
/** Tell PhpSpreadsheet that we want to use the Advanced Value Binder **/ /** Tell PhpSpreadsheet that we want to use the Advanced Value Binder **/
\PhpOffice\PhpSpreadsheet\Cell::setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); \PhpOffice\PhpSpreadsheet\Cell::setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() );
@ -856,7 +1091,9 @@ $reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
$reader->setDelimiter("\t"); $reader->setDelimiter("\t");
$spreadsheet = $reader->load($inputFileName); $spreadsheet = $reader->load($inputFileName);
``` ```
> See Examples/Reader/exampleReader15.php for a working example of this code.
> See Examples/Reader/exampleReader15.php for a working example of this
> code.
Loading using a Value Binder applies to: Loading using a Value Binder applies to:
@ -866,15 +1103,18 @@ Xlsx | NO | Xls | NO | Excel2003XML | NO
Ods | NO | SYLK | NO | Gnumeric | NO Ods | NO | SYLK | NO | Gnumeric | NO
CSV | YES | HTML | YES CSV | YES | HTML | YES
## Error Handling ## Error Handling
Of course, you should always apply some error handling to your scripts as well. PhpSpreadsheet throws exceptions, so you can wrap all your code that accesses the library methods within Try/Catch blocks to trap for any problems that are encountered, and deal with them in an appropriate manner. Of course, you should always apply some error handling to your scripts
as well. PhpSpreadsheet throws exceptions, so you can wrap all your code
that accesses the library methods within Try/Catch blocks to trap for
any problems that are encountered, and deal with them in an appropriate
manner.
The PhpSpreadsheet Readers throw a \PhpOffice\PhpSpreadsheet\Reader\Exception. The PhpSpreadsheet Readers throw a
\PhpOffice\PhpSpreadsheet\Reader\Exception.
```php ``` php
$inputFileName = './sampleData/example-1.xls'; $inputFileName = './sampleData/example-1.xls';
try { try {
@ -884,19 +1124,24 @@ try {
die('Error loading file: '.$e->getMessage()); die('Error loading file: '.$e->getMessage());
} }
``` ```
> See Examples/Reader/exampleReader16.php for a working example of this code.
> See Examples/Reader/exampleReader16.php for a working example of this
> code.
## Helper Methods ## Helper Methods
You can retrieve a list of worksheet names contained in a file without loading the whole file by using the Reader's `listWorksheetNames()` method; similarly, a `listWorksheetInfo()` method will retrieve the dimensions of each worksheet in a file without needing to load and parse the whole file. You can retrieve a list of worksheet names contained in a file without
loading the whole file by using the Reader's `listWorksheetNames()`
method; similarly, a `listWorksheetInfo()` method will retrieve the
dimensions of each worksheet in a file without needing to load and parse the
whole file.
### listWorksheetNames ### listWorksheetNames
The `listWorksheetNames()` method returns a simple array listing each worksheet name within the workbook: The `listWorksheetNames()` method returns a simple array listing each
worksheet name within the workbook:
```php ``` php
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType); $reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
$worksheetNames = $reader->listWorksheetNames($inputFileName); $worksheetNames = $reader->listWorksheetNames($inputFileName);
@ -908,13 +1153,16 @@ foreach ($worksheetNames as $worksheetName) {
} }
echo '</ol>'; echo '</ol>';
``` ```
> See Examples/Reader/exampleReader18.php for a working example of this code.
> See Examples/Reader/exampleReader18.php for a working example of this
> code.
### listWorksheetInfo ### listWorksheetInfo
The `listWorksheetInfo()` method returns a nested array, with each entry listing the name and dimensions for a worksheet: The `listWorksheetInfo()` method returns a nested array, with each entry
listing the name and dimensions for a worksheet:
```php ``` php
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType); $reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
$worksheetData = $reader->listWorksheetInfo($inputFileName); $worksheetData = $reader->listWorksheetInfo($inputFileName);
@ -931,4 +1179,6 @@ foreach ($worksheetData as $worksheet) {
} }
echo '</ol>'; echo '</ol>';
``` ```
> See Examples/Reader/exampleReader19.php for a working example of this code.
> See Examples/Reader/exampleReader19.php for a working example of this
> code.

File diff suppressed because it is too large

@ -1,46 +1,78 @@
# Configuration Settings # Configuration Settings
Once you have included the PhpSpreadsheet files in your script, but before instantiating a `Spreadsheet` object or loading a workbook file, there are a number of configuration options that can be set which will affect the subsequent behaviour of the script. Once you have included the PhpSpreadsheet files in your script, but
before instantiating a `Spreadsheet` object or loading a workbook file,
there are a number of configuration options that can be set which will
affect the subsequent behaviour of the script.
## Cell Caching ## Cell Caching
PhpSpreadsheet uses an average of about 1k/cell in your worksheets, so large workbooks can quickly use up available memory. Cell caching provides a mechanism that allows PhpSpreadsheet to maintain the cell objects in a smaller size of memory, on disk, or in APC, memcache or Wincache, rather than in PHP memory. This allows you to reduce the memory usage for large workbooks, although at a cost of speed to access cell data. PhpSpreadsheet uses an average of about 1k/cell in your worksheets, so
large workbooks can quickly use up available memory. Cell caching
provides a mechanism that allows PhpSpreadsheet to maintain the cell
objects in a smaller size of memory, on disk, or in APC, memcache or
Wincache, rather than in PHP memory. This allows you to reduce the
memory usage for large workbooks, although at a cost of speed to access
cell data.
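As a rough back-of-the-envelope check of that figure, the sketch below multiplies out the quoted average of about 1 KB per cell. The worksheet dimensions are hypothetical, and real usage will vary with cell content.

```php
<?php
// Rough estimate only: assumes the ~1 KB per cell average quoted above.
$rows         = 65536;
$columns      = 20;    // hypothetical worksheet width
$bytesPerCell = 1024;

$cells     = $rows * $columns;
$megabytes = $cells * $bytesPerCell / (1024 * 1024);

printf("%d cells, roughly %d MB\n", $cells, $megabytes);
```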
By default, PhpSpreadsheet still holds all cell objects in memory, but you can specify alternatives. To enable cell caching, you must call the \PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod() method, passing in the caching method that you wish to use. By default, PhpSpreadsheet still holds all cell objects in memory, but
you can specify alternatives. To enable cell caching, you must call the
\PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod() method,
passing in the caching method that you wish to use.
```php ``` php
$cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_IN_MEMORY; $cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_IN_MEMORY;
\PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod); \PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod);
``` ```
setCacheStorageMethod() will return a boolean true on success, false on failure (for example if trying to cache to APC when APC is not enabled). setCacheStorageMethod() will return a boolean true on success, false on
failure (for example if trying to cache to APC when APC is not enabled).
A separate cache is maintained for each individual worksheet, and is automatically created when the worksheet is instantiated based on the caching method and settings that you have configured. You cannot change the configuration settings once you have started to read a workbook, or have created your first worksheet. A separate cache is maintained for each individual worksheet, and is
automatically created when the worksheet is instantiated based on the
caching method and settings that you have configured. You cannot change
the configuration settings once you have started to read a workbook, or
have created your first worksheet.
Currently, the following caching methods are available. Currently, the following caching methods are available.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_IN_MEMORY ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_IN\_MEMORY
The default. If you don't initialise any caching method, then this is the method that PhpSpreadsheet will use. Cell objects are maintained in PHP memory as at present. The default. If you don't initialise any caching method, then this is
the method that PhpSpreadsheet will use. Cell objects are maintained in
PHP memory as at present.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_IN_MEMORY_SERIALIZED ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_IN\_MEMORY\_SERIALIZED
Using this caching method, cells are held in PHP memory as an array of serialized objects, which reduces the memory footprint with minimal performance overhead. Using this caching method, cells are held in PHP memory as an array of
serialized objects, which reduces the memory footprint with minimal
performance overhead.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_IN_MEMORY_GZIP ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_IN\_MEMORY\_GZIP
Like cache_in_memory_serialized, this method holds cells in PHP memory as an array of serialized objects, but gzipped to reduce the memory usage still further, although access to read or write a cell is slightly slower. Like cache\_in\_memory\_serialized, this method holds cells in PHP
memory as an array of serialized objects, but gzipped to reduce the
memory usage still further, although access to read or write a cell is
slightly slower.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_IGBINARY ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_IGBINARY
Uses PHP's igbinary extension (if it's available) to serialize cell objects in memory. This is normally faster and uses less memory than standard PHP serialization, but isn't available in most hosting environments. Uses PHP's igbinary extension (if it's available) to serialize cell
objects in memory. This is normally faster and uses less memory than
standard PHP serialization, but isn't available in most hosting
environments.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_DISCISAM ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_TO\_DISCISAM
When using CACHE_TO_DISCISAM all cells are held in a temporary disk file, with only an index to their location in that file maintained in PHP memory. This is slower than any of the CACHE_IN_MEMORY methods, but significantly reduces the memory footprint. By default, PhpSpreadsheet will use PHP's temp directory for the cache file, but you can specify a different directory when initialising CACHE_TO_DISCISAM. When using CACHE\_TO\_DISCISAM all cells are held in a temporary disk
file, with only an index to their location in that file maintained in
PHP memory. This is slower than any of the CACHE\_IN\_MEMORY methods,
but significantly reduces the memory footprint. By default,
PhpSpreadsheet will use PHP's temp directory for the cache file, but you
can specify a different directory when initialising CACHE\_TO\_DISCISAM.
```php ``` php
$cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_DISCISAM; $cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_DISCISAM;
$cacheSettings = array( $cacheSettings = array(
'dir' => '/usr/local/tmp' 'dir' => '/usr/local/tmp'
@ -48,13 +80,19 @@ $cacheSettings = array(
\PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod, $cacheSettings); \PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
``` ```
The temporary disk file is automatically deleted when your script terminates. The temporary disk file is automatically deleted when your script
terminates.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_PHPTEMP ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_TO\_PHPTEMP
Like CACHE_TO_DISCISAM, when using CACHE_TO_PHPTEMP all cells are held in the php://temp I/O stream, with only an index to their location maintained in PHP memory. In PHP, the php://memory wrapper stores data in the memory: php://temp behaves similarly, but uses a temporary file for storing the data when a certain memory limit is reached. The default is 1 MB, but you can change this when initialising CACHE_TO_PHPTEMP. Like CACHE\_TO\_DISCISAM, when using CACHE\_TO\_PHPTEMP all cells are
held in the php://temp I/O stream, with only an index to their location
maintained in PHP memory. In PHP, the php://memory wrapper stores data
in the memory: php://temp behaves similarly, but uses a temporary file
for storing the data when a certain memory limit is reached. The default
is 1 MB, but you can change this when initialising CACHE\_TO\_PHPTEMP.
```php ``` php
$cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_PHPTEMP; $cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_PHPTEMP;
$cacheSettings = array( $cacheSettings = array(
'memoryCacheSize' => '8MB' 'memoryCacheSize' => '8MB'
@ -62,13 +100,18 @@ $cacheSettings = array(
\PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod, $cacheSettings); \PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
``` ```
The php://temp file is automatically deleted when your script terminates. The php://temp file is automatically deleted when your script
terminates.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_APC ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_TO\_APC
When using CACHE_TO_APC, cell objects are maintained in APC with only an index maintained in PHP memory to identify that the cell exists. By default, an APC cache timeout of 600 seconds is used, which should be enough for most applications: although it is possible to change this when initialising CACHE_TO_APC. When using CACHE\_TO\_APC, cell objects are maintained in APC with only
an index maintained in PHP memory to identify that the cell exists. By
default, an APC cache timeout of 600 seconds is used, which should be
enough for most applications: although it is possible to change this
when initialising CACHE\_TO\_APC.
```php ``` php
$cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_APC; $cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_APC;
$cacheSettings = array( $cacheSettings = array(
'cacheTime' => 600 'cacheTime' => 600
@ -76,15 +119,22 @@ $cacheSettings = array(
\PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod, $cacheSettings); \PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
``` ```
When your script terminates all entries will be cleared from APC, regardless of the cacheTime value, so it cannot be used for persistent storage using this mechanism. When your script terminates all entries will be cleared from APC,
regardless of the cacheTime value, so it cannot be used for persistent
storage using this mechanism.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_MEMCACHE ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_TO\_MEMCACHE
When using CACHE_TO_MEMCACHE, cell objects are maintained in memcache with only an index maintained in PHP memory to identify that the cell exists. When using CACHE\_TO\_MEMCACHE, cell objects are maintained in memcache
with only an index maintained in PHP memory to identify that the cell
exists.
By default, PhpSpreadsheet looks for a memcache server on localhost at port 11211. It also sets a memcache timeout limit of 600 seconds. If you are running memcache on a different server or port, then you can change these defaults when you initialise CACHE_TO_MEMCACHE: By default, PhpSpreadsheet looks for a memcache server on localhost at
port 11211. It also sets a memcache timeout limit of 600 seconds. If you
are running memcache on a different server or port, then you can change
these defaults when you initialise CACHE\_TO\_MEMCACHE:
```php ``` php
$cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_MEMCACHE; $cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_MEMCACHE;
$cacheSettings = array( $cacheSettings = array(
'memcacheServer' => 'localhost', 'memcacheServer' => 'localhost',
@ -94,13 +144,19 @@ $cacheSettings = array(
\PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod, $cacheSettings); \PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
``` ```
When your script terminates all entries will be cleared from memcache, regardless of the cacheTime value, so it cannot be used for persistent storage using this mechanism. When your script terminates all entries will be cleared from memcache,
regardless of the cacheTime value, so it cannot be used for persistent
storage using this mechanism.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_WINCACHE ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_TO\_WINCACHE
When using CACHE_TO_WINCACHE, cell objects are maintained in Wincache with only an index maintained in PHP memory to identify that the cell exists. By default, a Wincache cache timeout of 600 seconds is used, which should be enough for most applications: although it is possible to change this when initialising CACHE_TO_WINCACHE. When using CACHE\_TO\_WINCACHE, cell objects are maintained in Wincache
with only an index maintained in PHP memory to identify that the cell
exists. By default, a Wincache cache timeout of 600 seconds is used,
which should be enough for most applications: although it is possible to
change this when initialising CACHE\_TO\_WINCACHE.
```php ``` php
$cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_WINCACHE; $cacheMethod = \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_WINCACHE;
$cacheSettings = array( $cacheSettings = array(
'cacheTime' => 600 'cacheTime' => 600
@ -108,22 +164,33 @@ $cacheSettings = array(
\PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod, $cacheSettings); \PhpOffice\PhpSpreadsheet\Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
``` ```
When your script terminates all entries will be cleared from Wincache, regardless of the cacheTime value, so it cannot be used for persistent storage using this mechanism. When your script terminates all entries will be cleared from Wincache,
regardless of the cacheTime value, so it cannot be used for persistent
storage using this mechanism.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_SQLITE ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_TO\_SQLITE
Uses an SQLite 2 "in-memory" database for caching cell data. Unlike other caching methods, neither cells nor an index are held in PHP memory - an indexed database table makes it unnecessary to hold any index in PHP memory, which makes this the most memory-efficient of the cell caching methods. Uses an SQLite 2 "in-memory" database for caching cell data. Unlike
other caching methods, neither cells nor an index are held in PHP memory
- an indexed database table makes it unnecessary to hold any index in
PHP memory, which makes this the most memory-efficient of the cell
caching methods.
### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE_TO_SQLITE3 ### \PhpOffice\PhpSpreadsheet\CachedObjectStorageFactory::CACHE\_TO\_SQLITE3
Uses an SQLite 3 "in-memory" database for caching cell data. Unlike other caching methods, neither cells nor an index are held in PHP memory - an indexed database table makes it unnecessary to hold any index in PHP memory, which makes this the most memory-efficient of the cell caching methods.
Uses an SQLite 3 "in-memory" database for caching cell data. Unlike
other caching methods, neither cells nor an index are held in PHP memory
- an indexed database table makes it unnecessary to hold any index in
PHP memory, which makes this the most memory-efficient of the cell
caching methods.
## Language/Locale ## Language/Locale
Some localisation elements have been included in PhpSpreadsheet. You can set a locale by changing the settings. To set the locale to Brazilian Portuguese you would use: Some localisation elements have been included in PhpSpreadsheet. You can
set a locale by changing the settings. To set the locale to Brazilian
Portuguese you would use:
```php ``` php
$locale = 'pt_br'; $locale = 'pt_br';
$validLocale = \PhpOffice\PhpSpreadsheet\Settings::setLocale($locale); $validLocale = \PhpOffice\PhpSpreadsheet\Settings::setLocale($locale);
if (!$validLocale) { if (!$validLocale) {
@ -131,7 +198,12 @@ if (!$validLocale) {
} }
``` ```
If Brazilian Portuguese language files aren't available, then Portuguese will be enabled instead: if Portuguese language files aren't available, then the setLocale() method will return an error, and American English (en_us) settings will be used throughout. If Brazilian Portuguese language files aren't available, then Portuguese
will be enabled instead: if Portuguese language files aren't available,
More details of the features available once a locale has been set, including a list of the languages and locales currently supported, can be found in the section of this document entitled "Locale Settings for Formulae". then the setLocale() method will return an error, and American English
(en\_us) settings will be used throughout.
More details of the features available once a locale has been set,
including a list of the languages and locales currently supported, can
be found in the section of this document entitled "Locale Settings for
Formulae".
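The fallback order described for setLocale() can be sketched in plain PHP. `resolveLocale()` and the list of available locales are hypothetical, written only to illustrate the order of the checks; this is not the library's own code.

```php
<?php
// Hypothetical sketch of the fallback order; not the library's own code.
// Returns [bool $valid, string $localeInUse].
function resolveLocale(string $requested, array $available): array
{
    $requested = strtolower($requested);
    if (in_array($requested, $available, true)) {
        return [true, $requested];
    }

    // Fall back from e.g. "pt_br" to the bare language code "pt".
    $language = explode('_', $requested)[0];
    if (in_array($language, $available, true)) {
        return [true, $language];
    }

    // Nothing suitable: report failure and use American English.
    return [false, 'en_us'];
}

var_dump(resolveLocale('pt_br', ['pt', 'fr']));
var_dump(resolveLocale('pt_br', ['fr']));
```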