Tutorial#

This example shows the simplest way to use pandas with pint, and introduces the underlying objects. Setting up units by hand is slightly more fiddly than reading data and units from a file; a more typical use case is given in Reading from csv.

Imports#

First, some imports:

In [1]: import pandas as pd

In [2]: import pint

In [3]: import pint_pandas

In [4]: pint_pandas.show_versions()
{'numpy': '1.26.2',
 'pandas': '2.1.3',
 'pint': '0.22',
 'pint_pandas': '0.4.dev138+g5751c0a'}

Create a DataFrame#

Next, we create a DataFrame with PintArrays as columns.

In [5]: df = pd.DataFrame(
   ...:    {
   ...:       "torque": pd.Series([1.0, 2.0, 2.0, 3.0], dtype="pint[lbf ft]"),
   ...:       "angular_velocity": pd.Series([1.0, 2.0, 2.0, 3.0], dtype="pint[rpm]"),
   ...:    }
   ...: )
   ...: 

In [6]: df
Out[6]: 
   torque  angular_velocity
0     1.0               1.0
1     2.0               2.0
2     2.0               2.0
3     3.0               3.0

DataFrame Operations#

Operations on columns are unit-aware, so they behave as we would intuitively expect.

In [7]: df["power"] = df["torque"] * df["angular_velocity"]

In [8]: df
Out[8]: 
   torque  angular_velocity  power
0     1.0               1.0    1.0
1     2.0               2.0    4.0
2     2.0               2.0    4.0
3     3.0               3.0    9.0

Note

Notice that the units are not displayed in the cells of the DataFrame. If you ever see units in the cells of the DataFrame, something isn’t right. See units_in_cells for more information.

We can see the columns’ units in the dtypes attribute:

In [9]: df.dtypes
Out[9]: 
torque                                       pint[foot * force_pound]
angular_velocity                         pint[revolutions_per_minute]
power               pint[foot * force_pound * revolutions_per_minute]
dtype: object

Each column can be accessed as a pandas Series:

In [10]: df.power
Out[10]: 
0    1.0
1    4.0
2    4.0
3    9.0
Name: power, dtype: pint[foot * force_pound * revolutions_per_minute]

The Series contains a PintArray:

In [11]: df.power.values
Out[11]: 
<PintArray>
[1.0, 4.0, 4.0, 9.0]
Length: 4, dtype: pint[foot * force_pound * revolutions_per_minute]

The PintArray, in turn, contains a Quantity:

In [12]: df.power.values.quantity
Out[12]: array([1., 4., 4., 9.]) <Unit('foot * force_pound * revolutions_per_minute')>

Pandas Series Accessors#

A pint accessor on pandas Series exposes most Quantity properties and methods. Arrays returned by these methods are converted to Series.

In [13]: df.power.pint.units
Out[13]: <Unit('foot * force_pound * revolutions_per_minute')>

In [14]: df.power.pint.to("kW")
Out[14]: 
0    0.00014198092353610379
1     0.0005679236941444151
2     0.0005679236941444151
3      0.001277828311824934
Name: power, dtype: pint[kilowatt]
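
As a sanity check on the converted values, the factor from lbf·ft·rpm to kW can be worked out by hand. The sketch below uses only the Python standard library. The constants are the exact SI definitions of the pound, standard gravity, and the foot; the variable names are illustrative, and the radian is treated as dimensionless, as pint does.

```python
import math

POUND_KG = 0.45359237        # kilograms per avoirdupois pound (exact by definition)
STANDARD_GRAVITY = 9.80665   # m/s**2 (exact by definition)
FOOT_M = 0.3048              # metres per foot (exact by definition)

# One pound-force foot expressed in joules (N*m).
lbf_ft_in_joules = POUND_KG * STANDARD_GRAVITY * FOOT_M

# One revolution per minute expressed in radians per second.
rpm_in_rad_per_s = 2 * math.pi / 60

# 1 lbf*ft*rpm expressed in kilowatts (watts divided by 1000).
kw_per_lbf_ft_rpm = lbf_ft_in_joules * rpm_in_rad_per_s / 1000

# Magnitudes of the power column from the tutorial.
powers = [1.0, 4.0, 4.0, 9.0]
powers_kw = [p * kw_per_lbf_ft_rpm for p in powers]
print(powers_kw)
```

The results should agree with the kilowatt column above to within floating-point rounding.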