Asked by: 小点点

Create a new column in a DataFrame based on a date


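I'm new to Python, and I want to add a column whose value depends on a date, with several conditions. My data comes from an Excel sheet.

My code currently looks like this: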

#Save data in a DataFrame

df1 = pd.read_excel(stream_io1, sheet_name = "Sheet1", header=0)

# Use Accounting Date 
df1['Item - Accounting Date'] = pd.to_datetime(df1['Item - Accounting Date'], format='%Y-%m-%d')

def condition(row):
    if (row['Item - Accounting Date'] < '2020-01-01') in row['Item - Accounting Date']:
        return "<2020"
    if "2020" in row['Item - Accounting Date']:
        return "2020"
    if (row[(row['Item - Accounting Date'] >= "01/01/2021") & (row['Item - Accounting Date'] <="30/06/2021")]) in row['Item - Accounting Date']:
        return "S1 2021"    
    if (row[(row['Item - Accounting Date'] > "30/06/2021") & (row['Item - Accounting Date'] <="31/12/2021")]) in row['Item - Accounting Date']:
        return "S2 2021" 

df1['Année'] = df1.apply(condition, axis = 1)

I get this error message:

TypeError: '

I understand the error, but I don't know how to fix it.


2 answers

Anonymous user

from datetime import datetime
datetime.strptime("2020-01-01","%Y-%m-%d")

This is how you convert a string to a datetime object.
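
To connect this to the question, here is a minimal sketch of using the parsed value as a cutoff against a pandas datetime column. The sample dates are made up for illustration and stand in for 'Item - Accounting Date'; the name `is_before_2020` is also just an assumption.

```python
from datetime import datetime

import pandas as pd

# Parse the cutoff string once into a datetime object.
cutoff = datetime.strptime("2020-01-01", "%Y-%m-%d")

# Hypothetical sample values standing in for 'Item - Accounting Date'.
dates = pd.to_datetime(pd.Series(["2019-12-31", "2020-06-15"]))

# A datetime column can be compared element-wise with a datetime cutoff.
is_before_2020 = dates < cutoff
print(is_before_2020)
```

The point is that both sides of the comparison are datetime values, rather than a datetime on one side and a string on the other.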

Anonymous user

It looks like you only need to apply the condition function to a single column, so you can fix it with pd.to_datetime like this:

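import pandas as pd

def condition(date):
    # 'date' is a single Timestamp from the column, so compare it with Timestamps, not strings
    if date < pd.to_datetime('2020-01-01'):
        return "<2020"
    # A Timestamp does not support "in"; check its year instead
    if date.year == 2020:
        return "2020"
    # ISO dates (YYYY-MM-DD) avoid day/month ambiguity when parsing
    if pd.to_datetime("2021-01-01") <= date <= pd.to_datetime("2021-06-30"):
        return "S1 2021"
    if pd.to_datetime("2021-06-30") < date <= pd.to_datetime("2021-12-31"):
        return "S2 2021"

df1['Année'] = df1['Item - Accounting Date'].apply(condition)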
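
If the column is large, a vectorized version may be worth considering instead of apply. The following is only a sketch under assumptions: the sample DataFrame is invented to stand in for the Excel data, and using numpy.select with an empty-string default for dates outside every range is my choice, not something from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet.
df1 = pd.DataFrame({
    "Item - Accounting Date": pd.to_datetime([
        "2019-05-10", "2020-03-01", "2021-02-15",
        "2021-06-30", "2021-09-01", "2022-01-05",
    ])
})

dates = df1["Item - Accounting Date"]

# One boolean mask per label, evaluated on the whole column at once.
conditions = [
    dates < pd.Timestamp("2020-01-01"),
    dates.dt.year == 2020,
    (dates >= pd.Timestamp("2021-01-01")) & (dates <= pd.Timestamp("2021-06-30")),
    (dates > pd.Timestamp("2021-06-30")) & (dates <= pd.Timestamp("2021-12-31")),
]
labels = ["<2020", "2020", "S1 2021", "S2 2021"]

# Rows matching no condition get an empty string (an assumption; pick any default).
df1["Année"] = np.select(conditions, labels, default="")
print(df1)
```

Here `&` is appropriate because each side is a boolean Series. Inside a function that receives a single value, a chained comparison or `and` is the usual choice.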