Of course! Resampling OHLC (Open, High, Low, Close) data is a fundamental task in financial analysis with Python. The primary library for this is pandas, which has a powerful built-in .resample() method.

Here’s a complete guide covering:
- Why Resample OHLC Data? (Common use cases)
- The Core Challenge: Why you can't just use a standard aggregation function like `mean()`.
- The Solution: Using `.resample()` with custom aggregation functions.
- Complete Code Examples: From creating sample data to performing various resampling tasks.
- Alternative (More Flexible) Method: Using `pd.Grouper`.
Why Resample OHLC Data?
Traders and analysts often work with data at different frequencies. Resampling allows you to convert data from one time frame to another.
- From Lower to Higher Frequency (Upsampling): Convert 1-minute data to 10-second data. This usually involves forward-filling or interpolating values, as there isn't always a trade in every 10-second bucket.
- From Higher to Lower Frequency (Downsampling):
- Convert 1-minute data to 5-minute data. This is the most common use case. You need to calculate the Open, High, Low, and Close for each new 5-minute interval based on the original 1-minute data within it.
- Convert hourly data to daily data.
- Convert daily data to weekly or monthly data.
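As a minimal sketch of the upsampling case (using hypothetical prices on a small `DatetimeIndex`), forward-filling carries the last known price into each new, finer-grained slot:

```python
import pandas as pd

# Three 1-minute closes upsampled to 20-second bars
idx = pd.date_range('2025-10-26 09:30:00', periods=3, freq='1min')
closes = pd.Series([100.0, 101.0, 102.0], index=idx)

# Forward-fill carries the last known price into each new 20-second slot
upsampled = closes.resample('20s').ffill()
print(upsampled)
```

Interpolation (`.interpolate()`) is an alternative, but forward-filling better matches the semantics of "the price until the next trade".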
The Core Challenge: Aggregation is Not Simple
If you have a list of numbers and want to find the average, you just sum them and divide by the count. OHLC data is different.

- Open: The price of the first trade in the new period.
- Close: The price of the last trade in the new period.
- High: The maximum price reached during the new period.
- Low: The minimum price reached during the new period.
A simple mean() or sum() doesn't make sense for these columns. You must apply specific aggregation functions to each column.
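To make this concrete, here is a tiny hand-worked example (with made-up prices) showing what the correct per-column rules produce for a single 5-minute bar, versus what a naive mean gives you:

```python
import pandas as pd

# Five hypothetical 1-minute close prices forming one 5-minute bar
prices = pd.Series([100.0, 102.5, 99.0, 101.0, 100.5])

# The correct bar, built with a different rule per field
bar = {
    'open': prices.iloc[0],    # first trade in the period
    'high': prices.max(),      # maximum price reached
    'low': prices.min(),       # minimum price reached
    'close': prices.iloc[-1],  # last trade in the period
}
print(bar)

# A naive mean collapses all of that structure into one number
print(prices.mean())  # 100.6
```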
The Solution: `.resample().agg()`
The pandas solution is a two-step process:
1. `.resample()`: Groups your time series data into bins (e.g., 5-minute bins).
2. `.agg()`: Applies one or more aggregation functions to each column of the grouped data.
You provide a dictionary to .agg() where the keys are the column names and the values are the aggregation functions to use.
Complete Code Examples
Let's walk through a full example.

Step 1: Setup and Create Sample Data
First, let's install pandas if you haven't already and create some sample 1-minute OHLC data.
pip install pandas
import pandas as pd
import numpy as np
# Create a date range for our sample data
# Let's create 1-minute data for one business day
date_rng = pd.date_range(start='2025-10-26 09:30:00', end='2025-10-26 16:00:00', freq='1min')
# Create a DataFrame with random OHLC data
# In a real scenario, you would load this from a CSV or API
np.random.seed(42) # for reproducibility
n = len(date_rng)
ohlc_data = pd.DataFrame({
'open': np.random.uniform(150, 155, n),
'high': np.random.uniform(155, 160, n),
'low': np.random.uniform(148, 153, n),
'close': np.random.uniform(151, 157, n),
'volume': np.random.randint(1000, 10000, n)
}, index=date_rng)
# Ensure high is always >= open, close, low and low is always <= open, close, high
ohlc_data['high'] = ohlc_data[['open', 'high', 'close']].max(axis=1)
ohlc_data['low'] = ohlc_data[['open', 'low', 'close']].min(axis=1)
print("--- Original 1-Minute Data ---")
print(ohlc_data.head())
Step 2: Resample to 5-Minute Bars (Downsampling)
This is the most common and important operation. We want to create 5-minute OHLC bars from our 1-minute data.
# Define the aggregation rules for each column
agg_rules = {
'open': 'first', # The first 'open' in the 5-min period
'high': 'max', # The highest 'high' in the 5-min period
'low': 'min', # The lowest 'low' in the 5-min period
'close': 'last', # The last 'close' in the 5-min period
'volume': 'sum' # The sum of all volumes in the 5-min period
}
# Resample the data to 5-minute intervals
# ('5min' is the current alias; the older '5T' is deprecated in pandas 2.2+)
five_min_bars = ohlc_data.resample('5min').agg(agg_rules)
print("\n--- Resampled 5-Minute Data ---")
print(five_min_bars.head())
Explanation of Aggregation Functions:
- `'first'` for `open`: Gets the first value of the `open` column within each 5-minute bin.
- `'last'` for `close`: Gets the last value of the `close` column.
- `'max'` for `high`: Gets the maximum value.
- `'min'` for `low`: Gets the minimum value.
- `'sum'` for `volume`: Sums up all the trades in the period.
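One detail worth knowing: by default, pandas labels each bar with the *start* of its interval. If your convention (as on some charting platforms) stamps bars with the interval *end*, `.resample()` accepts `label` and `closed` parameters. A small sketch on hypothetical data:

```python
import pandas as pd
import numpy as np

# Ten 1-minute closes to illustrate bar-labelling conventions
idx = pd.date_range('2025-10-26 09:30', periods=10, freq='1min')
df = pd.DataFrame({'close': np.arange(10.0)}, index=idx)

# Default: bars labelled with the *start* of each 5-minute interval,
# bins include their left edge
left = df.resample('5min').agg({'close': 'last'})

# label='right', closed='right': bars labelled with the interval *end*,
# bins include their right edge instead
right = df.resample('5min', label='right', closed='right').agg({'close': 'last'})

print(left.index[0], right.index[0])
```

Note that changing `closed` also changes which rows fall into each bin, so the two results can have different numbers of bars.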
Common Time Aliases for Resampling:
- `min` for minutes (the older alias `T` is deprecated)
- `h` for hours (the older alias `H` is deprecated)
- `D` for calendar days
- `B` for business days (Mon-Fri)
- `W` for weekly (anchored on Sunday by default)
- `ME` for month-end (the older alias `M` is deprecated)
- `YE` for year-end (the older alias `Y` is deprecated)
Step 3: Resample to Daily Bars
The process is identical; you just change the resampling frequency.
# Resample the data to daily (business day) intervals
daily_bars = ohlc_data.resample('B').agg(agg_rules)
print("\n--- Resampled Daily Data ---")
print(daily_bars.head())
Step 4: Resample to Hourly Bars
# Resample the data to hourly intervals
# ('h' is the current alias; the older 'H' is deprecated in pandas 2.2+)
hourly_bars = ohlc_data.resample('h').agg(agg_rules)
print("\n--- Resampled Hourly Data ---")
print(hourly_bars.head())
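One wrinkle with hourly bars on session data: plain `resample('h')` aligns bins to the clock hour (09:00, 10:00, ...), not to a 09:30 session open. The `offset` parameter of `.resample()` shifts the bin edges; a sketch on synthetic data:

```python
import pandas as pd
import numpy as np

# Synthetic 1-minute closes for a 09:30-16:00 session
idx = pd.date_range('2025-10-26 09:30', '2025-10-26 16:00', freq='1min')
df = pd.DataFrame(
    {'close': np.random.default_rng(0).uniform(150, 155, len(idx))},
    index=idx)

# Plain hourly bars start on the clock hour (first bar labelled 09:00)
on_hour = df.resample('h').agg({'close': 'last'})

# offset='30min' shifts bin edges so bars align with the 09:30 open
session = df.resample('h', offset='30min').agg({'close': 'last'})

print(on_hour.index[0], session.index[0])
```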
Alternative Method: Using pd.Grouper
The .resample() method is concise and perfect for regular time series. However, for more complex grouping (e.g., grouping by business month end, or grouping by a specific timezone), pd.Grouper is a more flexible and powerful alternative.
The syntax is slightly different but achieves the same result.
# Define the aggregation rules
agg_rules = {
'open': 'first',
'high': 'max',
'low': 'min',
'close': 'last',
'volume': 'sum'
}
# Use pd.Grouper to group by 5-minute intervals
# With no key= argument, Grouper operates on the DatetimeIndex;
# pass key='column_name' to group on a datetime column instead
five_min_bars_grouper = ohlc_data.groupby(pd.Grouper(freq='5min')).agg(agg_rules)
print("\n--- 5-Minute Data using pd.Grouper ---")
print(five_min_bars_grouper.head())
When to use which?
- Use `.resample()` for most time-based downsampling. It's idiomatic pandas and very readable.
- Use `pd.Grouper()` when you need more flexibility, such as grouping by multiple criteria (e.g., a time bucket and a symbol column) or using more complex frequencies.
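The "multiple criteria" case is where `pd.Grouper` really earns its keep. A sketch with a hypothetical long-format frame holding two symbols, grouped by symbol and a 5-minute bucket in one pass (something `.resample()` alone cannot express):

```python
import pandas as pd

# Hypothetical two-symbol tick data in long format
df = pd.DataFrame({
    'timestamp': pd.to_datetime(
        ['2025-10-26 09:30', '2025-10-26 09:31',
         '2025-10-26 09:30', '2025-10-26 09:31']),
    'symbol': ['AAPL', 'AAPL', 'MSFT', 'MSFT'],
    'close': [150.0, 151.0, 300.0, 301.0],
})

# Group by symbol AND a 5-minute time bucket simultaneously;
# key='timestamp' tells Grouper which column carries the datetimes
bars = (df.groupby(['symbol', pd.Grouper(key='timestamp', freq='5min')])
          .agg(close=('close', 'last')))
print(bars)
```

The result has a (`symbol`, `timestamp`) MultiIndex with one bar per symbol per bucket.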
Summary and Best Practices
- Always use `.agg()` with a dictionary when resampling OHLC data. This is the correct way to apply different functions to different columns.
- Standard Aggregations: `first`, `last`, `max`, `min`, and `sum` are your primary tools.
- Handling Missing Data: If your data has gaps (e.g., no trades over a weekend), `resample` will create rows of `NaN`. You can fill them if needed; for example, to forward-fill the close price: `five_min_bars['close'] = five_min_bars['close'].ffill()` (the older `fillna(method='ffill')` form is deprecated).
- Data Integrity: After resampling, it's good practice to verify that `high >= max(open, close)` and `low <= min(open, close)` for each bar, as floating-point arithmetic or data errors can sometimes violate this.
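That integrity check can be written as a vectorised one-liner. A sketch on a toy frame where the second bar is deliberately broken:

```python
import pandas as pd

# Example bars; the second one violates OHLC consistency (high < open)
bars = pd.DataFrame({
    'open':  [100.0, 101.0],
    'high':  [102.0, 100.5],
    'low':   [ 99.0, 100.0],
    'close': [101.0, 100.2],
})

# high must bound open/close from above; low must bound them from below
valid = ((bars['high'] >= bars[['open', 'close']].max(axis=1)) &
         (bars['low'] <= bars[['open', 'close']].min(axis=1)))

print(bars[~valid])  # rows that need investigation
```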
