python xlrd unsupported format, or corrupt file.
    import xlrd
    wb = xlrd.open_workbook("Z:\\Data\\Locates\\3.8 locates.xls")
    sh = wb.sheet_by_index(0)
    print sh.cell(0,0).value
Traceback (most recent call last):
  File "Z:\Wilson\tradedStockStatus.py", line 18, in <module>
    wb = xlrd.open_workbook("Z:\\Data\\Locates\\3.8 locates.xls")
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 429, in open_workbook
    biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 1545, in getbof
    bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 1539, in bof_error
    raise XLRDError('Unsupported format, or corrupt file: ' + msg)
xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '<table r'
The file doesn't seem to be corrupted or of a different format. Anything to help find the source of the issue would be great.
You wrote: "The file doesn't seem to be corrupted or of a different format."
However, as the error message says, the first 8 bytes of the file are '<table r', and that is definitely not the Excel .xls format. Open the file with a text editor (e.g. Notepad) that won't take any notice of the (incorrect) .xls extension and see for yourself.
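If you would rather check from Python than from a text editor, here is a minimal sketch that looks at the first bytes of the file. The helper name sniff_format and its return strings are made up for illustration, it is written for Python 3, and it only covers the common cases (very old pre-OLE2 .xls files, for instance, are not recognised):

```python
# Hypothetical helper: guess a file's real format from its first bytes.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # container used by binary .xls files
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx files are zip archives


def sniff_format(path):
    """Return a rough guess at what kind of file `path` really is."""
    with open(path, "rb") as f:
        head = f.read(8)  # the same 8 bytes that xlrd shows in its error message
    if head.startswith(OLE2_MAGIC):
        return "xls (OLE2 binary)"
    if head.startswith(ZIP_MAGIC):
        return "zip (possibly xlsx)"
    if head.lstrip().startswith(b"<"):
        return "markup (HTML or XML)"
    return "unknown (maybe CSV or other text)"
```

Calling sniff_format on a file that starts with '<table r' lands in the markup branch, which tells you to reach for an HTML parser rather than xlrd.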
Try to open it with pandas:
    import pandas as pd
    # read_html returns a list of DataFrames, one per <table> found
    data = pd.read_html('filename.xls')
Or try any other Python HTML parser.
That's not a proper Excel file, but an HTML file that Excel can read.
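If pandas is not available, the standard library can do a rough job as well. The following is only a sketch for Python 3: the class TableRows and the function read_html_rows are names invented here, the utf-8 default encoding is an assumption about your file, and nested tables or rowspan/colspan are not handled.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()   # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()


def read_html_rows(path, encoding="utf-8"):
    """Parse an HTML file and return its table rows as lists of strings."""
    parser = TableRows()
    with open(path, encoding=encoding, errors="replace") as f:
        parser.feed(f.read())
    parser.close()
    return parser.rows
```

read_html_rows('filename.xls') then gives a list of rows, each a list of cell strings, which is roughly what sh.cell(row, col).value would have given you.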
I had a similar problem and it was related to the version. In a python terminal check:
    >>> import xlrd
    >>> xlrd.__VERSION__
If you have '0.9.0' you can open almost all files. If you have '0.6.0' which was what I found on Ubuntu, you may have problems with newest Excel files. You can download the latest version of xlrd using the Distutils standard.
This can also happen with some files while they are open in Excel.
I ran into a similar problem when I downloaded an .xls file and tried to open it with the xlrd library. Then I tried the solution of converting the .xls into .xlsx, as detailed here: how to convert xls to xlsx
It works like a charm, and rather than opening the .xls, I am now working with the .xlsx file using the openpyxl library.
Hope it helps to solve your issue.
In my case, after opening the file with a text editor as @john-machin suggested, I realized the file is not in the binary format an Excel file is supposed to have; it is in CSV format and was merely saved with an Excel extension. I renamed the file to give it a .csv extension and used the read_csv function instead:
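    import os
    import pandas as pd

    os.rename('sample_file.xls', 'sample_file.csv')
    # on_bad_lines='skip' (pandas >= 1.3) replaces the older error_bad_lines=False
    df = pd.read_csv('sample_file.csv', on_bad_lines='skip')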
I just downloaded xlrd, created an Excel document (Excel 2007) for testing and got the same error (the message says 'found PK\x03\x04\x14\x00\x06\x00'). The extension is .xlsx. I tried saving it in the older .xls format and the error disappears.
I ran into the same problem.
It lies in the .xls file itself: it looks like an Excel file, but it isn't. (See if there's a pop-up when you simply open the .xls in Excel.)
The comment by sjmachin from Jan 19, 2013 at https://github.com/python-excel/xlrd/issues/26 helps.
I ran into this problem too. I opened the file in Excel and saved it in another format, such as Excel 97-2003, and that finally solved the problem.
I had the same issue. Those old files are formatted as tab-delimited text. I've been able to open my problem files with read_table, i.e. df = pd.read_table('trouble_maker.xls').
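For files like this that turn out to be delimited text, the standard csv module is an alternative to pandas. This is a rough sketch for Python 3; the function name read_delimited is made up, and the tab delimiter and utf-8 encoding are assumptions you may need to change for your file:

```python
import csv


def read_delimited(path, delimiter="\t", encoding="utf-8"):
    """Read a delimited text file (whatever its extension) into a list of rows."""
    # newline="" lets the csv module handle line endings itself
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f, delimiter=delimiter))
```

For a comma-separated file you would pass delimiter="," instead, e.g. read_delimited('trouble_maker.xls', delimiter=',').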
There's nothing wrong with your file. xlrd does not yet support xlsx (Excel 2007+) files, although it has purportedly supported them for some time.
Two days ago they committed a pre-alpha version to their git repository that integrates xlsx support. Other forums suggest using a DOM parser for xlsx files, since an xlsx file is just a zip archive containing XML; I have not tried this. Another package with functionality similar to xlrd is openpyxl, which you can get from easy_install or pip. I have not tried this either; however, its API is supposed to be similar to xlrd's.
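To see the "zip archive containing XML" point for yourself, here is a small sketch using only the standard library (Python 3). The function names are invented for this example, and the member name in the usage note is what such archives typically contain, not something guaranteed for every file:

```python
import zipfile
import xml.etree.ElementTree as ET


def list_xlsx_parts(path):
    """Return the names of the files stored inside an .xlsx (zip) archive."""
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()


def root_tag_of_part(path, member):
    """Parse one XML member of the archive and return its root element's tag."""
    with zipfile.ZipFile(path) as zf:
        with zf.open(member) as f:
            return ET.parse(f).getroot().tag
```

You would call list_xlsx_parts on your file first, then pass one of the returned names (often something like 'xl/workbook.xml') to root_tag_of_part. If zipfile raises BadZipFile, the file is not an xlsx at all.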