Comparing to None in Python and Pandas
Truthy and falsy values, None, and comparison in Python and Pandas.
Written by Reka Horvath

Missing data is a frequent source of headaches (and bugs 🐛). Often, it’s far from obvious whether a value can be empty. And if it can, handling it usually means introducing several conditionals in the code and edge cases in the tests.
Truthy vs Falsy Values and None in Python
The Concept of Truthy and Falsy
To make the conditionals that check for missing data more concise, many programming languages, including Python, have a concept of “truthy” and “falsy” values. Thanks to this, various non-boolean data types can be interpreted in boolean contexts. It’s important to distinguish between:
- The literal True or False values: These are boolean.
- Truthy or falsy values: These can be any boolean or non-boolean data type.
By default, the following are considered “falsy” in Python:
- Constants defined to be false: None and False.
- Zero of any numeric type: 0, 0.0, 0j, Decimal(0), Fraction(0, 1)
- Empty sequences and collections: '', (), [], {}, set(), range(0)
  - Note that an empty string is also considered an empty collection.
Any other value is considered “truthy”.
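As a quick sanity check, the list above can be verified with the built-in bool() function. This is a minimal sketch; the sample truthy values are chosen only for illustration and are not from the original post.

```python
from decimal import Decimal
from fractions import Fraction

# The falsy values listed above.
falsy_values = [
    None, False,
    0, 0.0, 0j, Decimal(0), Fraction(0, 1),
    "", (), [], {}, set(), range(0),
]

# Illustrative truthy values (assumed examples): note that "0", " "
# and [0] are non-empty, so they are truthy.
truthy_values = [True, 1, -1, 0.1, "0", " ", [0], (None,), {"a": None}, range(1)]

all_falsy = not any(bool(value) for value in falsy_values)
all_truthy = all(bool(value) for value in truthy_values)

print(all_falsy)   # True
print(all_truthy)  # True
```

A non-empty container is truthy even if everything inside it is falsy, which is why [0] and (None,) land in the truthy group.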
Comparisons
The concept of truthy and falsy values has a big benefit: It allows you to use non-boolean expressions in conditions and other boolean operations. This makes the code more concise.
data = []
if data:
    print("Data is truthy!")  # This won't print.
else:
    print("Data is falsy!")  # This will print.
data is an empty list, which is considered falsy, so the else branch gets executed.
Falsy Values with Special Meaning
The concise comparison above rests on one big assumption: that all falsy values should produce the same behavior. It no longer works if a falsy value has a special meaning, for example if the code needs to behave differently for None and for an empty string.
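To make this limitation concrete, here is a minimal sketch. The function name nickname_status and its return strings are hypothetical, chosen only for illustration. A plain truthiness check sends None and the empty string down the same branch, so the caller cannot tell them apart.

```python
def nickname_status(nickname):
    # Truthiness check: None and "" are both falsy,
    # so both end up in the same branch.
    if not nickname:
        return "no nickname"
    return f"nickname is {nickname!r}"


print(nickname_status(None))   # no nickname
print(nickname_status(""))     # no nickname
print(nickname_status("Rex"))  # nickname is 'Rex'
```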
Giving a falsy value a special meaning is usually a bad practice, because it can easily lead to confusion and bugs 🐛. For example, Django’s documentation discourages using multiple values for “no data”:
Avoid using null on string-based fields such as CharField and TextField. If a string-based field has null=True, that means it has two possible values for “no data”: NULL, and the empty string. In most cases, it’s redundant to have two possible values for “no data;” the Django convention is to use the empty string, not NULL.
Django 4.2 Documentation / Model field reference / Field options
Comparing to None (if You Absolutely Have To)
What if (after considering the trade-offs above) you’ve decided to give a special meaning to a falsy value? How do you check whether a value is actually None?
There are 2 ways to achieve this:
- using the equality (==) operator ❌
- using is ✅
PEP 8 has a clear recommendation:
Comparisons to singletons like None should always be done with is or is not, never the equality operators.
The problem with using == is that it’s possible for a class to override the __eq__ method (which determines the behavior of ==), which could lead to unexpected results.
class AlwaysEqual:
    def __eq__(self, other):
        # Claims equality with everything, including None.
        return True


obj = AlwaysEqual()  # named obj to avoid shadowing the built-in object
print(obj == None)  # prints: True
print(obj is None)  # prints: False
The AlwaysEqual class overrides the __eq__ method to always return True. Therefore, even though the object isn’t None, when compared to None using ==, it returns True.
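The identity check is also what makes it possible to treat None differently from other falsy values. The following is a minimal sketch under assumed semantics: the function describe_nickname and the meanings attached to None and "" are hypothetical, not something the post prescribes.

```python
def describe_nickname(nickname=None):
    # Identity check: only the None singleton matches here,
    # not "", 0, [] or any other falsy value.
    if nickname is None:
        return "nickname was never provided"
    if nickname == "":
        return "nickname was explicitly cleared"
    return f"nickname is {nickname!r}"


print(describe_nickname())       # nickname was never provided
print(describe_nickname(""))     # nickname was explicitly cleared
print(describe_nickname("Rex"))  # nickname is 'Rex'
```

The same pattern works in the negative form, for example if nickname is not None.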
Pandas: Use isna
Similarly to the comparison to None in Python, there are 2 ways to detect missing values in Pandas. And one is clearly preferred:
- Comparing to numpy.nan with the equality operator == ❌
- Using the isna or isnull function ✅
An equality check with the == operator, like df['column'] == np.nan, behaves differently than you might expect. This stems from a peculiar property of numpy.nan: It is not considered equal to any value, even itself. (Note that this differs from Python’s None, which is a singleton, so None == None returns True.)
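This property is not specific to NumPy: numpy.nan is an ordinary Python float holding the IEEE 754 NaN value, so the behavior can be reproduced with the standard library alone. A minimal sketch, using float("nan") as a stand-in for numpy.nan:

```python
import math

nan = float("nan")  # stand-in for numpy.nan, which is also a plain float

nan_equals_itself = nan == nan      # NaN never compares equal, even to itself
nan_detected = math.isnan(nan)      # the reliable way to test a single float
none_equals_itself = None == None   # None is a singleton and equals itself

print(nan_equals_itself)   # False
print(nan_detected)        # True
print(none_equals_itself)  # True
```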
Let’s consider this DataFrame with 2 columns A and B as an example:
import numpy as np
import pandas as pd

data = {"A": [1, 2, np.nan, 4], "B": [9, 10, 11, 12]}
df = pd.DataFrame(data)
Comparison with == numpy.nan
print(df["A"] == np.nan)
The output:
0 False
1 False
2 False
3 False
Name: A, dtype: bool
The returned value is always False, even for np.nan.
Comparison with isna()
print(df["A"].isna())
The output:
0 False
1 False
2 True
3 False
Name: A, dtype: bool
The returned value is:
- True for np.nan
- False for every other value.
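The boolean Series returned by isna can be used directly for counting or filtering. The sketch below uses a hypothetical variant of the DataFrame with one extra None value, to show that isna treats None in a numeric column as missing too. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column "A" contains both np.nan and None.
df = pd.DataFrame({"A": [1, 2, np.nan, None], "B": [9, 10, 11, 12]})

missing_mask = df["A"].isna()            # True where the value is missing
missing_count = int(missing_mask.sum())  # each True counts as 1
complete_rows = df[df["A"].notna()]      # keep rows where "A" is present

print(missing_mask.tolist())  # [False, False, True, True]
print(missing_count)          # 2
print(len(complete_rows))     # 2
```

isnull is an alias of isna, so df["A"].isnull() gives the same result.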
For more info, check out Pandas Docs / Missing Data
Summary
Dealing with missing and empty values is tricky. In this post, we’ve discussed 3 guidelines that make it less error-prone:
- Don’t assign special meaning to falsy values.
- When comparing to None in Python, use is or is not.
- When looking for missing values in Pandas, use the isna or isnull functions.
Resources
- PEP 8 / Programming Recommendations
- Django 4.2 Documentation / Model field reference / Field options
- Pandas Docs / Missing Data