Tutorial#
This example shows the simplest way to use pandas with pint and the underlying objects. Setting up units by hand is slightly fiddly compared to reading data and units from a file; a more typical use case is given in Reading from csv.
Imports#
First, some imports:
In [1]: import pandas as pd
In [2]: import pint
In [3]: import pint_pandas
In [4]: pint_pandas.show_versions()
{'numpy': '1.26.2',
'pandas': '2.1.3',
'pint': '0.22',
'pint_pandas': '0.4.dev138+g5751c0a'}
Create a DataFrame#
Next, we create a DataFrame with PintArrays as columns.
In [5]: df = pd.DataFrame(
   ...:     {
   ...:         "torque": pd.Series([1.0, 2.0, 2.0, 3.0], dtype="pint[lbf ft]"),
   ...:         "angular_velocity": pd.Series([1.0, 2.0, 2.0, 3.0], dtype="pint[rpm]"),
   ...:     }
   ...: )
   ...:
In [6]: df
Out[6]:
   torque  angular_velocity
0     1.0               1.0
1     2.0               2.0
2     2.0               2.0
3     3.0               3.0
DataFrame Operations#
Operations on columns are unit-aware, so they behave as we would intuitively expect: multiplying two columns multiplies both their magnitudes and their units.
In [7]: df["power"] = df["torque"] * df["angular_velocity"]
In [8]: df
Out[8]:
   torque  angular_velocity  power
0     1.0               1.0    1.0
1     2.0               2.0    4.0
2     2.0               2.0    4.0
3     3.0               3.0    9.0
Note
Notice that the units are not displayed in the cells of the DataFrame. If you ever see units in the cells of the DataFrame, something isn’t right. See units_in_cells for more information.
We can see the columns’ units in the dtypes attribute:
In [9]: df.dtypes
Out[9]:
torque                                       pint[foot * force_pound]
angular_velocity                         pint[revolutions_per_minute]
power               pint[foot * force_pound * revolutions_per_minute]
dtype: object
Each column can be accessed as a pandas Series:
In [10]: df.power
Out[10]:
0    1.0
1    4.0
2    4.0
3    9.0
Name: power, dtype: pint[foot * force_pound * revolutions_per_minute]
The Series contains a PintArray:
In [11]: df.power.values
Out[11]:
<PintArray>
[1.0, 4.0, 4.0, 9.0]
Length: 4, dtype: pint[foot * force_pound * revolutions_per_minute]
The PintArray, in turn, contains a Quantity:
In [12]: df.power.values.quantity
Out[12]: array([1., 4., 4., 9.]) <Unit('foot * force_pound * revolutions_per_minute')>
Pandas Series Accessors#
Pandas Series accessors are provided for most Quantity properties and methods, under the pint namespace. Results of methods that return arrays are converted to Series.
In [13]: df.power.pint.units
Out[13]: <Unit('foot * force_pound * revolutions_per_minute')>
In [14]: df.power.pint.to("kW")
Out[14]:
0    0.00014198092353610379
1     0.0005679236941444151
2     0.0005679236941444151
3      0.001277828311824934
Name: power, dtype: pint[kilowatt]
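To see where the converted values come from, the same conversion can be reproduced by hand without pint. This is a sanity-check sketch, not how pint performs the conversion internally; the helper name `lbf_ft_rpm_to_kw` and the constant names are made up for illustration. It uses the standard definitions 1 lbf = 4.4482216152605 N, 1 ft = 0.3048 m, and 1 revolution per minute = 2π/60 rad/s, with the radian treated as dimensionless.

```python
import math

LBF_IN_NEWTON = 4.4482216152605       # 1 pound-force in newtons (exact by definition)
FOOT_IN_METRE = 0.3048                # 1 international foot in metres (exact by definition)
RPM_IN_RAD_PER_S = 2 * math.pi / 60   # 1 revolution per minute in radians per second


def lbf_ft_rpm_to_kw(value):
    """Convert a power in lbf*ft*rpm to kilowatts (hypothetical helper)."""
    watts = value * LBF_IN_NEWTON * FOOT_IN_METRE * RPM_IN_RAD_PER_S
    return watts / 1000.0


# Magnitudes of the power column from the tutorial.
converted = [lbf_ft_rpm_to_kw(v) for v in [1.0, 4.0, 4.0, 9.0]]
print(converted)
```

The first value comes out at roughly 0.000142 kW, in line with the output of `df.power.pint.to("kW")` above, up to floating-point rounding.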