Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c++ - The label type must be float if you want to read the XML file of a random forest (OpenCV 3.0)

#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>

#include <iostream>
#include <vector>

int main()
{
    size_t const FeatureSize = 24;
    {
        auto rtrees = cv::ml::RTrees::create();
        rtrees->setMaxDepth(10);
        rtrees->setMinSampleCount(2);
        rtrees->setRegressionAccuracy(0);
        rtrees->setUseSurrogates(false);
        rtrees->setMaxCategories(16);
        rtrees->setPriors(cv::Mat());
        rtrees->setCalculateVarImportance(false);
        rtrees->setActiveVarCount(0);
        rtrees->setTermCriteria({ cv::TermCriteria::MAX_ITER, 100, 0 });

        std::vector<float> labels; //#1
        cv::Mat_<float> features;        
        for(size_t i = 0; i != 500; ++i){
            std::vector<float> data;
            for(size_t j = 0; j != FeatureSize; ++j){
                data.emplace_back(0); //#2
            }
            labels.emplace_back(i % 2);
            features.push_back(cv::Mat(data, true));
        }                

        rtrees->train(features.reshape(1, labels.size()),
                      cv::ml::ROW_SAMPLE, labels);
        rtrees->write(cv::FileStorage("smoke_classifier.xml",
                                  cv::FileStorage::WRITE));
    }

    {
        auto rtrees2 = cv::ml::RTrees::create();

        cv::FileStorage read("smoke_classifier.xml",
                             cv::FileStorage::READ);
        rtrees2->read(read.root());

        int a = rtrees2->getMinSampleCount();
        std::cout<<"a == "<<a<<"\n";
        cv::Mat1f feat2(1, FeatureSize, 0.f);
        std::cout<<"predict == "<<rtrees2->predict(feat2)<<"\n";
    }  
} 

If you change #1 from float to int, then read the XML and call predict, the program crashes. If I skip reading the model from the XML, predict works even when the type at #1 is int.

But if I change the labels from int to float, RTrees raises a different error when I call train (the dummy data "0" at #2 in the snippet does not crash the program, but the real data does).

The other problem is that changing the labels from int to float turns this from a classification problem into a regression problem, and what I really need is classification (although it is easy to mimic classification with regression, since there are only two labels).

The error message when I change the labels to float and call train:

"....\opencv-3.0.0\sources\modules\ml\src\tree.cpp:1190: error: (-215) (int)_sleft.size() < n && (int)_sright.size() < n in function cv::ml::DTreesImpl::calcDir"

1 Reply


The relevant code is in tree.cpp.

When using int labels, this line will cause the crash:

float DTreesImpl::predictTrees( const Range& range, const Mat& sample, int flags ) const
{
    ...
    if( predictType == PREDICT_MAX_VOTE ) {
    ...
        sum = (flags & RAW_OUTPUT) ? (float)best_idx : classLabels[best_idx]; // Line 1487
    ...
    }
}

because classLabels is empty (even though it is present in the XML file).

When using float labels, this line won't be executed, since predictType would be PREDICT_SUM instead of PREDICT_MAX_VOTE. (The relevant code is in the same function).

The underlying cause of the empty classLabels is that the file is not loaded correctly (this may be a bug). When the file is read, readParams performs this check:

void DTreesImpl::readParams( const FileNode& fn )
{
    ...
    int format = 0; // line 1720
    fn["format"] >> format;
    bool isLegacy = format < 3;
    ...
    if (isLegacy) { ... }
    else 
    {
        ...
        fn["class_labels"] >> classLabels;            
    }
}

However, when the file is written, the "format" field is not emitted. format therefore stays 0, isLegacy is true, and the file is read through the legacy branch, so the class_labels read in the else branch is skipped.
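The effect of the missing field can be reproduced in isolation. The sketch below is plain C++, not OpenCV: `FakeFileNode`, `isLegacyFormat`, `demoFormatCheck`, and the key `some_other_key` are hypothetical names that only mirror the shape of the check quoted above, where a default of 0 is overwritten only if the key exists.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a parsed model file: key -> integer value.
using FakeFileNode = std::map<std::string, int>;

// Mirrors the shape of the quoted check: "format" starts at 0 and is only
// overwritten when the key is present, so a missing key means "legacy".
inline bool isLegacyFormat(const FakeFileNode& fn)
{
    int format = 0;
    const auto it = fn.find("format");
    if (it != fn.end())
        format = it->second;
    return format < 3;
}

// Small self-check covering both cases.
inline void demoFormatCheck()
{
    FakeFileNode asWritten{{"some_other_key", 100}}; // no "format" key
    assert(isLegacyFormat(asWritten));               // falls into the legacy branch

    asWritten["format"] = 3;                         // the field is now present
    assert(!isLegacyFormat(asWritten));              // non-legacy branch is taken
}
```

Under these assumptions, a file without the key is always classified as legacy, regardless of what else it contains.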


A workaround is to append the missing field after saving the file:

...
std::vector<int> labels;
...
rtrees->write(cv::FileStorage("smoke_classifier.xml", cv::FileStorage::WRITE));
// Add this
{
    cv::FileStorage fs("smoke_classifier.xml", cv::FileStorage::APPEND);
    fs << "format" << 3; // so that isLegacy evaluates to false when the file is read
}

cv::FileStorage read("smoke_classifier.xml",
                     cv::FileStorage::READ);
auto rtrees2 = cv::ml::RTrees::create();
rtrees2->read(read.root());

With this change, the file is loaded correctly and the program no longer crashes.

Since I'm not able to reproduce your other problem in calcDir, let me know if this works.

