Version 0.6.1, 2007-06-10
- Version number updated to 1.6.1
- Documented runxlrd.py commands in its usage message. Changed commands: dump to biff_dump, count_records to biff_count.
- Bug fixed: Missing "<" in a struct.unpack call means can't open files on bigendian platforms. Discovered by "Mihalis".
- Removed antique undocumented Book.get_name_dict method and experimental "trimming" facility.
- Meaningful exception instead of IndexError if a SAT (sector allocation table) is corrupted.
- If no CODEPAGE record in pre-8.0 file, assume ascii and keep going (instead of raising exception).
- At least one source of XLS files writes parent style XF records *after* the child cell
XF records that refer to them, triggering IndexError in 0.5.2 and AssertionError in
Reported with sample file by Todd O'Bryan.
Fixed by changing to two-pass processing of XF records.
- Formatting info in pre-BIFF8 files: Ensured appropriate defaults and lossless conversions to make
the info BIFF8-compatible. Fixed bug in extracting the "used" flags.
- Fixed problems discovered with opening test files from Planmaker 2006
(1) Four files have reduced size of PALETTE record
(51 and 32 colours; Excel writes 56 always). xlrd now emits a NOTE to the logfile and continues.
(2) FORMULA records use the Excel 2.x record code 0x0021 instead of 0x0221. xlrd now continues silently.
(3) In two files, at the OLE2 compound document level, the internal directory says that the length of
the Short-Stream Container Stream is 16384 bytes, but the actual contents are 11264 and 9728 bytes respectively.
xlrd now emits a WARNING to the logfile and continues.
- After discussion with Daniel Rentz, the concept of two lists of XF (eXtended Format) objects
(raw_xf_list and computed_xf_list) has been abandoned. There is now a single list, called xf_list
- Added Book.sheets ... for sheetx, sheet in enumerate(book.sheets):
- Formatting info: extraction of sheet-level flags from WINDOW2 record, and sheet.visibility
from BOUNDSHEET record. Added Macintosh-only Font attributes "outline" and "shadow'.
- Added extraction of merged cells info.
- pyExcelerator uses "general" instead of "General" for the generic "number format". Worked around.
- Crystal Reports writes "WORKBOOK" in the OLE2 Compound Document directory instead of "Workbook".
Changed to case-insensitive directory search. Reported by Vic Simkus.
Version 0.6.1a1, 2006-12-18
- Added formatting information for cells (font, "number format", background, border, alignment and protection)
and rows/columns (height/width etc). To save memory and time for those who don't need it,
this information is extracted only if formatting_info=1 is supplied
to the open_workbook() function. The cell records BLANK and MULBLANKS
which contain no data, only formatting information, will continue to be ignored
in the default (no formatting info) case.
- Ralph Heimburger reported a problem with xlrd being intolerant
about an Excel 4.0 file (created by "some web app") with a DIMENSIONS record
that omitted Microsoft's usual padding with 2 unused bytes. Fixed.
Version 0.6.0a4, not released
- Added extraction of human-readable formulas from NAME records.
- Worked around OOo Calc writing 9-byte BOOLERR records instead of 8. Reported by Rory Campbell-Lange.
- This history file converted to descending chronological order and HTML format.
Version 0.6.0a3, 2006-09-19
- Names: minor bugfixes; added script xlrdnameAPIdemo.py
- ROW records were being used as additional hints for sizing memory requirements. In some
files the ROW records overstate the number of used columns, and/or there are ROW records for
rows that have no data in them. This would cause xlrd to report sheet.ncols and/or sheet.nrows
as larger than reasonably expected. Change: ROW records are ignored. The number of columns/rows is
based solely on the highest column/row index seen in non-empty data records. Empty data records (types
BLANK and MULBLANKS) which contain no data, only formatting information, have always been ignored, and
this will continue. Consequence: trailing rows and columns which contain only empty cells will
Version 0.6.0a2, 2006-09-13
- Fixed a bug reported by Rory Campbell-Lange.: "open failed"; incorrect assumptions about the layout
of array formulas which return strings.
- Further work on defined names, especially the API.
Version 0.6.0a1, 2006-09-08
- Sheet objects have two new convenience methods: col_values(colx, start_rowx=0, end_rowx=None)
and the corresponding col_types. Suggested by Dennis O'Brien.
- BIFF 8 file missing its CODEPAGE record: xlrd will now assume utf_16_le encoding
(the only possibility) and keep going.
- Older files missing a CODEPAGE record: an exception will be raised.
Thanks to Sergey Krushinsky for a sample file.
The open_workbook() function has a new argument (encoding_override) which can
be used if the CODEPAGE record is missing or incorrect (for example, codepage=1251
but the data is actually encoded in koi8_r). The runxlrd.py script takes a
corresponding -e argument, for example -e cp1251
- Further work done on parsing "number formats". Thanks to Chris Withers for the
- Excel 97 introduced the concept of row and column labels, defined by Insert > Name > Labels.
The ranges containing the labels are now exposed as the Sheet attributes
row_label_ranges and col_label_ranges.
- The major effort in this 0.6.0 release has been the provision of access
to named cell ranges and named constants (Excel: Insert/Name/Define).
Juan C. Méndez provided very useful real-world sample files.
Version 0.5.3a1, 2006-05-24
- John Popplewell and Richard Sharp provided sample files which caused any
reliance at all on DIMENSIONS records and ROW records to be abandoned.
- If the file size is not a whole number of OLE sectors, a warning message is logged.
Previously this caused an exception to be raised.
Version 0.5.2, 2006-03-14, public release
- Updated version numbers, README, HISTORY.
Version 0.5.2a3, 2006-03-13
- Gnumeric writes user-defined formats with format codes starting at
50 instead of 164; worked around.
- Thanks to Didrik Pinte for reporting the need for xlrd to be more tolerant
of the idiosyncracies of other software, for supplying sample files,
and for performing alpha testing.
- '_' character in a format should be treated like an escape character; fixed.
- An "empty" formula result means a zero-length string, not an empty cell! Fixed.
Version 0.5.2a2, 2006-03-09
- Found that Gnumeric writes all DIMENSIONS records with nrows and ncols
each 1 less than they should be (except when it clamps ncols at 256!),
and pyXLwriter doesn't write ROW records. Cell memory pre-allocation was
generalised to use ROW records if available with fall-back to DIMENSIONS records.
Version 0.5.2a1, 2006-03-06
- pyXLwriter writes DIMENSIONS record with antique opcode 0x0000
instead of 0x0200; worked around
- A file written by Gnumeric had zeroes in DIMENSIONS record
but data in cell A1; worked around
Version 0.5.1, 2006-02-18, released to Journyx
- Python 2.1 mmap requires file to be opened for update access.
Added fall-back to read-only access without mmap if 2.1 open fails
because "permission denied".
Version 0.5, 2006-02-07, released to Journyx
- Now works with Python 2.1. Backporting to Python 2.1 was partially
funded by Journyx - provider of timesheet and project accounting
- open_workbook() can be given the contents of a file
instead of its name. Thanks to Remco Boerma for the suggestion.
- New module attribute __VERSION__ (as a string; for example "0.5")
- Minor enhancements to classification of formats as date or not-date.
- Added warnings about files with inconsistent OLE compound document
structures. Thanks to Roman V. Kiseliov (author of pyExcelerator)
for the tip-off.
Version 0.4a1, 2005-09-07, released to Laurent T.
- Book and sheet objects can now be pickled and unpickled.
Instead of reading a large spreadsheet multiple times,
consider pickling it once and loading the saved pickle;
can be much faster. Thanks to Laurent Thioudellet for the
- Using the mmap module can be turned off.
But you would only do that for benchmarking purposes.
- Handling NUMBER records has been made faster
Version 0.3a1, 2005-05-15, first public release