Read and Write Excel Files with Python
How to read, write, and format Excel spreadsheets using openpyxl and pandas in Python.
Note: This guide follows English-language naming conventions and terminology standards common in international development teams. Examples use English identifiers and comments to maximize compatibility across codebases and tooling.
Overview
Excel files (.xlsx) are everywhere in business. Python can read, write, and format them programmatically using openpyxl (cell-level control) and pandas (data-frame operations). This recipe covers both approaches for common tasks like reading sheets, writing data, applying formatting, and handling multi-sheet workbooks.
When to Use
- You need to read data from Excel files exported by business tools
- You are generating Excel reports from a database or API
- You need to format cells (colors, borders, number formats) programmatically
- You are automating a workflow that involves multiple Excel sheets
Solution
Reading Excel with pandas
import pandas as pd
# Read a single sheet
df = pd.read_excel("data.xlsx", sheet_name="Sheet1")
print(df.head())
print(df.columns)
# Read all sheets into a dict of DataFrames
sheets = pd.read_excel("data.xlsx", sheet_name=None)
for name, sheet_df in sheets.items():
    print(f"Sheet: {name}, rows: {len(sheet_df)}")
Writing Excel with pandas
import pandas as pd
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "score": [85, 92, 78],
})
# Basic write
df.to_excel("output.xlsx", index=False, sheet_name="Results")
# Multiple sheets
with pd.ExcelWriter("report.xlsx") as writer:
    df.to_excel(writer, sheet_name="Summary", index=False)
    df[df["score"] > 80].to_excel(writer, sheet_name="High Scores", index=False)
Cell-level control with openpyxl
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment, Border, Side
wb = Workbook()
ws = wb.active
ws.title = "Report"
# Header row with styling
headers = ["Name", "Score", "Grade"]
header_fill = PatternFill(start_color="1a56db", end_color="1a56db", fill_type="solid")
header_font = Font(color="FFFFFF", bold=True)
for col, header in enumerate(headers, 1):
    cell = ws.cell(row=1, column=col, value=header)
    cell.fill = header_fill
    cell.font = header_font
    cell.alignment = Alignment(horizontal="center")
# Data rows
data = [("Alice", 85, "B"), ("Bob", 92, "A"), ("Charlie", 78, "C")]
for row_idx, (name, score, grade) in enumerate(data, 2):
    ws.cell(row=row_idx, column=1, value=name)
    ws.cell(row=row_idx, column=2, value=score)
    ws.cell(row=row_idx, column=3, value=grade)
# Approximate auto-sizing: widen each column to fit its longest value
for col in ws.columns:
    max_length = max(len(str(cell.value or "")) for cell in col)
    ws.column_dimensions[col[0].column_letter].width = max_length + 2
wb.save("formatted_report.xlsx")
Reading with openpyxl
from openpyxl import load_workbook
wb = load_workbook("data.xlsx", data_only=True) # data_only reads computed values
ws = wb["Sheet1"]
for row in ws.iter_rows(min_row=1, max_row=5, values_only=True):
    print(row)
# Access a specific cell
print(ws["A1"].value)
Adding formulas
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
ws["A1"] = 10
ws["A2"] = 20
ws["A3"] = 30
ws["A4"] = "=SUM(A1:A3)"
ws["A5"] = "=AVERAGE(A1:A3)"
wb.save("formulas.xlsx")
Explanation
pandas delegates to an engine when reading and writing .xlsx files. openpyxl is the default engine for reading, and for writing when xlsxwriter is not installed. Use pandas for data-centric operations (filtering, grouping, joining) and openpyxl when you need cell-level control (formatting, formulas, merged cells, charts).
Key differences:
- pd.read_excel returns a DataFrame. Good for analysis but loses formatting.
- openpyxl.load_workbook preserves formatting and gives you cell objects. Slower for large files.
- pd.ExcelWriter with engine="openpyxl" and mode="a" lets you add DataFrames to an existing workbook while leaving its other sheets and their formatting in place.
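The two libraries can also be combined in a single pass. The sketch below writes a DataFrame with pandas and then styles the header row through the openpyxl worksheet that the writer exposes. It assumes the openpyxl engine is installed; the sample data, the temporary output path, and the red header color are made up for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import Font

# Hypothetical sample data and output location, for illustration only.
out_path = Path(tempfile.mkdtemp()) / "styled.xlsx"
df = pd.DataFrame({"name": ["Alice", "Bob"], "score": [85, 92]})

with pd.ExcelWriter(out_path, engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="Results", index=False)
    # With the openpyxl engine, writer.sheets maps sheet names
    # to openpyxl worksheet objects.
    ws = writer.sheets["Results"]
    for cell in ws[1]:  # row 1 holds the column headers
        cell.font = Font(bold=True, color="FF0000")

# Reload the file to confirm that both data and styling were saved.
header_cell = load_workbook(out_path)["Results"]["A1"]
print(header_cell.value, header_cell.font.bold)
```

Styling inside the with block avoids a second load and save of the workbook, which is usually the simpler route when pandas produces the data and only light formatting is needed.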
Variants
| Library | Level | Best For | Dependencies |
|---|---|---|---|
| pandas | DataFrame | Data analysis, bulk read/write | pandas, openpyxl |
| openpyxl | Cell | Formatting, formulas, charts | openpyxl |
| xlsxwriter | Cell | Writing only, charts, conditional formatting | xlsxwriter |
| xlrd | Read-only | Legacy .xls files | xlrd |
Guidelines
- Use pandas for reading and writing data. Use openpyxl for formatting and formulas.
- Always pass index=False to to_excel unless you need the index column.
- Use data_only=True with load_workbook to read computed values instead of formula strings.
- Set column widths explicitly. openpyxl does not auto-fit columns.
- Use the pd.ExcelWriter context manager to write multiple sheets in one file.
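The data_only guideline has a caveat that is easier to see in code: openpyxl does not evaluate formulas itself, so data_only=True returns only a value that a spreadsheet application previously calculated and stored in the file. The sketch below, using a temporary file path chosen for illustration, shows what happens with a workbook that was written by openpyxl and never opened in Excel.

```python
import tempfile
from pathlib import Path

from openpyxl import Workbook, load_workbook

# Temporary path used only for this illustration.
path = Path(tempfile.mkdtemp()) / "formulas.xlsx"

wb = Workbook()
ws = wb.active
ws["A1"] = 10
ws["A2"] = 20
ws["A3"] = "=SUM(A1:A2)"
wb.save(path)

# Default load: formula cells come back as formula strings.
formula_view = load_workbook(path).active["A3"].value

# data_only=True: returns the cached result, which is missing here
# because no spreadsheet application has calculated and saved it.
cached_view = load_workbook(path, data_only=True).active["A3"].value

print(formula_view)
print(cached_view)
```

If a pipeline needs the computed numbers from a file it generated itself, it is generally safer to calculate them in Python and write the values than to rely on cached formula results.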
Common Mistakes
- Forgetting to install openpyxl. pandas needs it as an engine for .xlsx files.
- Using openpyxl for large files (10k+ rows). It is slow; use pandas for bulk operations.
- Not passing data_only=True when reading formulas. You get the formula string instead of the result.
- Overwriting an existing workbook with to_excel. It replaces the file; use ExcelWriter with mode="a" to append.
- Ignoring number formats. Excel may display dates and numbers differently than Python expects.
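The overwrite pitfall and its fix can be sketched as follows. The example assumes a pandas version that supports the if_sheet_exists parameter (1.3 or later) and uses made-up DataFrames and a temporary file path.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Temporary path and sample frames, for illustration only.
path = Path(tempfile.mkdtemp()) / "report.xlsx"
summary = pd.DataFrame({"name": ["Alice", "Bob"], "score": [85, 92]})
details = pd.DataFrame({"name": ["Charlie"], "score": [78]})

# The first write creates the file with a single sheet.
summary.to_excel(path, sheet_name="Summary", index=False)

# Append mode needs the openpyxl engine and an existing file.
# if_sheet_exists controls what happens on a sheet name clash.
with pd.ExcelWriter(
    path, engine="openpyxl", mode="a", if_sheet_exists="replace"
) as writer:
    details.to_excel(writer, sheet_name="Details", index=False)

# Both sheets are now present in the workbook.
sheet_names = list(pd.read_excel(path, sheet_name=None))
print(sheet_names)
```

Calling details.to_excel(path, ...) directly instead would have replaced the whole file and dropped the Summary sheet.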
Frequently Asked Questions
How do I read a specific range of cells?
With openpyxl, use ws.iter_rows(min_row=2, max_row=10, min_col=1, max_col=3, values_only=True). With pandas, use usecols and skiprows parameters.
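As a rough illustration of the pandas route, the sketch below first writes a small sample file and then reads back only part of it. The sample data and temporary path are invented; nrows is an additional read_excel parameter used here to cap the number of data rows.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Build a small sample workbook so the example is self-contained.
path = Path(tempfile.mkdtemp()) / "data.xlsx"
pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Dana"],
    "score": [85, 92, 78, 88],
    "grade": ["B", "A", "C", "B"],
    "notes": ["n1", "n2", "n3", "n4"],
}).to_excel(path, index=False)

# usecols="A:C" keeps Excel columns A through C.
# skiprows=[1] drops the second line of the sheet (the first data row),
# while the header on the first line is kept.
# nrows=2 limits the result to two data rows.
subset = pd.read_excel(path, usecols="A:C", skiprows=[1], nrows=2)
print(subset)
```

Passing a list to skiprows keeps the header row in place; passing an integer instead skips that many lines from the top, header included.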
How do I add conditional formatting?
Use openpyxl.formatting.rule or xlsxwriter. For example, color scales and data bars are supported via ColorScaleRule and DataBarRule.
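A minimal sketch of a two-color scale with openpyxl might look like this. The scores, the cell range, the hex colors, and the temporary output path are arbitrary choices for illustration.

```python
import tempfile
from pathlib import Path

from openpyxl import Workbook
from openpyxl.formatting.rule import ColorScaleRule

wb = Workbook()
ws = wb.active
for score in [85, 92, 78, 64, 99]:
    ws.append([score])  # one score per row in column A

# Shade from red (lowest value) to green (highest value).
rule = ColorScaleRule(
    start_type="min", start_color="F8696B",
    end_type="max", end_color="63BE7B",
)
ws.conditional_formatting.add("A1:A5", rule)

path = Path(tempfile.mkdtemp()) / "conditional.xlsx"
wb.save(path)
```

The rule is stored in the file and evaluated by the spreadsheet application when the workbook is opened, so the colors update if the values are edited later.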
How do I handle .xls (legacy) files?
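Use xlrd for reading and xlwt for writing. pandas reads .xls files with engine="xlrd". Older pandas versions could also write them with engine="xlwt", but that writer was deprecated and then removed in pandas 2.0, so writing .xlsx is the safer choice for new code. Note that xlrd dropped .xlsx support in version 2.0.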
Can I create charts in Excel with Python?
Yes. openpyxl.chart supports bar, line, and pie charts. xlsxwriter also supports charts with a similar API.
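As one possible starting point, the sketch below builds a bar chart with openpyxl from a small table. The data, the chart title, the anchor cell D2, and the temporary output path are illustrative assumptions.

```python
import tempfile
from pathlib import Path

from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

wb = Workbook()
ws = wb.active
ws.append(["Name", "Score"])
for row in [("Alice", 85), ("Bob", 92), ("Charlie", 78)]:
    ws.append(row)

chart = BarChart()
chart.title = "Scores"

# The data range includes the header cell so it can serve as the series title.
data = Reference(ws, min_col=2, min_row=1, max_row=4)
categories = Reference(ws, min_col=1, min_row=2, max_row=4)
chart.add_data(data, titles_from_data=True)
chart.set_categories(categories)

# Place the top-left corner of the chart at cell D2.
ws.add_chart(chart, "D2")

path = Path(tempfile.mkdtemp()) / "chart.xlsx"
wb.save(path)
```

The chart references worksheet cells rather than copying values, so it reflects later edits to the data when the file is opened in a spreadsheet application.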
Related Resources
Parse CSV Files with Python and Pandas
How to read, filter, and transform large CSV files efficiently using Python pandas and the csv module.
Convert CSV to JSON
How to convert CSV data to JSON format in Python, Java, and JavaScript.
Convert JSON to CSV
How to convert JSON data to CSV format in Python, Java, and JavaScript.
Generate PDF Reports with Python
Create styled PDF documents from data using ReportLab and fpdf2 in Python.
Merge JSON Files
How to merge multiple JSON files into a single object or array in Python, Java, and JavaScript.