
Last edited: 2024-11-28

Tip 20: How do you quickly get the 3 categories with the most data?

Load the data:

import pandas as pd

df = pd.read_csv("IMDB-Movie-Data.csv")
df

The dataset has 1000 rows. The frequency of each genre value is:

vc = df["genre"].value_counts()
vc

Output:

Action,Adventure,Sci-Fi       50
Drama                         48
Comedy,Drama,Romance          35
Comedy                        32
Drama,Romance                 31
                              ..
Adventure,Comedy,Fantasy       1
Biography,History,Thriller     1
Action,Horror                  1
Mystery,Thriller,Western       1
Animation,Fantasy              1
Name: genre, Length: 207, dtype: int64
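The reason the next step can simply take the first entries is that value_counts() returns its result sorted by count in descending order by default. A minimal sketch on toy data (the values below are made up for illustration and are not taken from IMDB-Movie-Data.csv):

```python
import pandas as pd

# Toy data, invented for illustration only
genres = pd.Series(
    ["Drama", "Comedy", "Drama", "Action", "Drama", "Comedy"],
    name="genre",
)

# value_counts() sorts by count, largest first, by default
toy_vc = genres.value_counts()
print(toy_vc)

# The first index label is therefore the most frequent value
most_common = toy_vc.index[0]
print(most_common)
```

Note that each distinct string counts as its own category, so a combined value such as "Action,Adventure,Sci-Fi" is counted separately from "Action".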

Select the index of the top 3:

top_genre = vc[0:3].index
print(top_genre)

Output:

Index(['Action,Adventure,Sci-Fi', 'Drama', 'Comedy,Drama,Romance'], dtype='object')
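Slicing the first three entries is one way to do this. As a sketch, the same labels can also be obtained with head(3) or nlargest(3). The toy Series below is an assumption for illustration, chosen so that no two counts are tied:

```python
import pandas as pd

# Toy counts: a=4, c=3, b=2, d=1 (invented data, no ties)
toy_vc = pd.Series(
    ["a", "b", "a", "c", "a", "b", "d", "c", "c", "a"]
).value_counts()

# Three equivalent ways to take the 3 most frequent labels
top_by_slice = toy_vc.iloc[0:3].index        # explicit positional slice
top_by_head = toy_vc.head(3).index           # first 3 rows of the sorted counts
top_by_nlargest = toy_vc.nlargest(3).index   # 3 largest counts

print(list(top_by_slice))
print(list(top_by_head))
print(list(top_by_nlargest))
```

When several categories share the same count at the cutoff, which of them ends up in the top 3 depends on their order in the counts, so it may be worth checking for ties before relying on the result.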

Use the top-3 index together with isin to select the matching rows of df:

df_top = df[df["genre"].isin(top_genre)]
df_top

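Result: df_top keeps only the rows whose genre is one of these three values, which is 50 + 48 + 35 = 133 rows according to the counts above.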
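The steps above can be wrapped into a small helper. This is only a sketch: the function name filter_top_categories and the toy DataFrame are hypothetical and do not come from the original article.

```python
import pandas as pd


def filter_top_categories(df, column, n=3):
    """Return the rows of df whose value in `column` is among the n most frequent.

    Hypothetical helper written for illustration.
    """
    # Labels of the n most frequent values (value_counts sorts descending)
    top_values = df[column].value_counts().head(n).index
    # Keep only the rows whose value is one of those labels
    return df[df[column].isin(top_values)]


# Toy data, invented for illustration only
toy_df = pd.DataFrame({
    "title": ["m1", "m2", "m3", "m4", "m5", "m6", "m7"],
    "genre": ["Drama", "Comedy", "Drama", "Action", "Comedy", "Horror", "Drama"],
})

# Drama (3 rows) and Comedy (2 rows) are the two most frequent genres here
toy_top = filter_top_categories(toy_df, "genre", n=2)
print(toy_top)
```

With the article's data, the equivalent call would be filter_top_categories(df, "genre", n=3).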
