pandas - How to use MultiLabelBinarizer for Multilabel classification?

Question

posted Feb 19, 2021 in Technique by 深蓝 (71.8m points)

I am trying to do multilabel classification, but I am really stuck at data preprocessing. My target data is in a separate file and looks like this:

   Id              Tag
0   1             data
1   4               c#
2   4         winforms
3   4  type-conversion
4   4          decimal

I am trying to use MultiLabelBinarizer to preprocess the data. In the end, I want the result to look something like this:

ID  data  c#  winforms  type-conversion  decimal
 1     1   0         0                0        0
 4     0   1         1                1        1

1 Reply

深蓝 · Answer 1 · 2021-02-19T03:40:09+0000

You are not doing anything wrong. MultiLabelBinarizer, like most scikit-learn transformers, returns a NumPy array rather than a DataFrame. In this case, the underlying data looks identical to your expected output, just without the ID index and the tag names as column labels.
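If you do want to stay with MultiLabelBinarizer, a rough sketch of putting the labels back might look like this. The frame df is rebuilt from the table in the question, and the names tags_per_id and binarized are just illustrative. It assumes the tags are first grouped into one list per Id. Note that classes_ is sorted, so the column order will differ from the order shown in the question.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Sample data rebuilt from the question's table (assumed shape of the real file)
df = pd.DataFrame({
    "Id": [1, 4, 4, 4, 4],
    "Tag": ["data", "c#", "winforms", "type-conversion", "decimal"],
})

# MultiLabelBinarizer expects one collection of labels per sample,
# so gather the tags belonging to each Id into a list first
tags_per_id = df.groupby("Id")["Tag"].apply(list)

mlb = MultiLabelBinarizer()
encoded = mlb.fit_transform(tags_per_id)  # plain NumPy array of 0/1 values

# Wrap the array in a DataFrame, restoring the Id index and tag column names
binarized = pd.DataFrame(encoded, index=tags_per_id.index, columns=mlb.classes_)
print(binarized)
```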

Use pd.crosstab instead:

pd.crosstab(df['Id'], df['Tag'])
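For completeness, here is a small self-contained sketch of the crosstab approach on the sample data from the question. The names table and indicator are my own. The clip step is an assumption on my part: crosstab counts occurrences, so it only matters if the same (Id, Tag) pair can appear more than once in the real file. The last step reorders the columns by first appearance, since crosstab sorts them alphabetically.

```python
import pandas as pd

# Sample data rebuilt from the question's table (assumed shape of the real file)
df = pd.DataFrame({
    "Id": [1, 4, 4, 4, 4],
    "Tag": ["data", "c#", "winforms", "type-conversion", "decimal"],
})

# Rows are Ids, columns are tags, cells are occurrence counts
table = pd.crosstab(df["Id"], df["Tag"])

# Cap counts at 1 so repeated (Id, Tag) pairs still give a 0/1 indicator
indicator = table.clip(upper=1)

# Optional: put the columns in order of first appearance instead of sorted order
indicator = indicator[df["Tag"].unique().tolist()]
print(indicator)
```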
