
Java LogisticRegression Class Code Examples


This article collects typical usage examples of the Java class org.apache.spark.ml.classification.LogisticRegression. If you have been wondering what the LogisticRegression class does, how to use it, or what working examples look like, the curated code examples below should help.



The LogisticRegression class belongs to the org.apache.spark.ml.classification package. Nine code examples of the class are shown below, sorted by popularity.
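Before diving into the collected examples, here is a minimal, self-contained sketch of the Estimator/Transformer workflow they all build on. The master setting, application name, and data path are illustrative assumptions, not taken from any example below:

import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class LogisticRegressionQuickStart {
    public static void main(String[] args) {
        // Illustrative local session; adjust master and app name to your environment.
        SparkSession spark = SparkSession.builder()
                .master("local[*]")
                .appName("LogisticRegressionQuickStart")
                .getOrCreate();

        // The libsvm data source yields the standard "label"/"features" columns.
        Dataset<Row> data = spark.read().format("libsvm")
                .load("data/sample_libsvm_data.txt"); // illustrative path

        // LogisticRegression is an Estimator: fit() returns a Model, which is a Transformer.
        LogisticRegression lr = new LogisticRegression()
                .setMaxIter(10)
                .setRegParam(0.01);
        LogisticRegressionModel model = lr.fit(data);

        // transform() appends rawPrediction, probability and prediction columns.
        model.transform(data).select("label", "probability", "prediction").show(5);

        spark.stop();
    }
}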

Example 1: testLogisticRegression

import org.apache.spark.ml.classification.LogisticRegression; // import the required package/class
@Test
public void testLogisticRegression() {
    //prepare data
    String datapath = "src/test/resources/binary_classification_test.libsvm";

    Dataset<Row> trainingData = spark.read().format("libsvm").load(datapath);

    //Train model in spark
    LogisticRegressionModel lrmodel = new LogisticRegression().fit(trainingData);

    //Export this model
    byte[] exportedModel = ModelExporter.export(lrmodel);

    //Import and get Transformer
    Transformer transformer = ModelImporter.importAndGetTransformer(exportedModel);

    //validate predictions
    List<LabeledPoint> testPoints = MLUtils.loadLibSVMFile(jsc.sc(), datapath).toJavaRDD().collect();
    for (LabeledPoint i : testPoints) {
        Vector v = i.features().asML();
        double actual = lrmodel.predict(v);

        Map<String, Object> data = new HashMap<String, Object>();
        data.put("features", v.toArray());
        transformer.transform(data);
        double predicted = (double) data.get("prediction");

        assertEquals(actual, predicted, 0.01);
    }
}
 
Developer: flipkart-incubator · Project: spark-transformers · Lines: 31 · Source: LogisticRegression1BridgeTest.java
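The ModelExporter/ModelImporter pair used here comes from the spark-transformers library and serializes the fitted model to a byte array so it can be scored outside a Spark runtime. If you only need to persist and reload the model inside Spark, the built-in MLWritable API is a simpler alternative; a minimal sketch, with an illustrative path:

// Spark's native persistence round-trips the model on the JVM instead of exporting it.
lrmodel.write().overwrite().save("/tmp/lr-model"); // illustrative path
LogisticRegressionModel reloaded = LogisticRegressionModel.load("/tmp/lr-model");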


Example 2: shouldExportAndImportCorrectly

import org.apache.spark.ml.classification.LogisticRegression; // import the required package/class
@Test
public void shouldExportAndImportCorrectly() {
    //prepare data
    String datapath = "src/test/resources/binary_classification_test.libsvm";

    Dataset<Row> trainingData = spark.read().format("libsvm").load(datapath);

    //Train model in spark
    LogisticRegressionModel lrmodel = new LogisticRegression().fit(trainingData);

    //Export this model
    byte[] exportedModel = ModelExporter.export(lrmodel);

    //Import it back
    LogisticRegressionModelInfo importedModel = (LogisticRegressionModelInfo) ModelImporter.importModelInfo(exportedModel);

    //check that the two models are equal field by field
    //(watch for edge cases, e.g. the order of elements in a list changing)
    assertEquals(lrmodel.intercept(), importedModel.getIntercept(), 0.01);
    assertEquals(lrmodel.numClasses(), importedModel.getNumClasses(), 0.01);
    assertEquals(lrmodel.numFeatures(), importedModel.getNumFeatures(), 0.01);
    assertEquals(lrmodel.getThreshold(), importedModel.getThreshold(), 0.01);
    for (int i = 0; i < importedModel.getNumFeatures(); i++)
        assertEquals(lrmodel.coefficients().toArray()[i], importedModel.getWeights()[i], 0.01);

    assertEquals(lrmodel.getFeaturesCol(), importedModel.getInputKeys().iterator().next());
    assertEquals(lrmodel.getPredictionCol(), importedModel.getOutputKeys().iterator().next());
}
 
Developer: flipkart-incubator · Project: spark-transformers · Lines: 29 · Source: LogisticRegression1ExporterTest.java


Example 3: testLogisticRegression (Spark 1.x DataFrame variant of Example 1)

import org.apache.spark.ml.classification.LogisticRegression; // import the required package/class
@Test
public void testLogisticRegression() {
    //prepare data
    String datapath = "src/test/resources/binary_classification_test.libsvm";

    DataFrame trainingData = sqlContext.read().format("libsvm").load(datapath);

    //Train model in spark
    LogisticRegressionModel lrmodel = new LogisticRegression().fit(trainingData);

    //Export this model
    byte[] exportedModel = ModelExporter.export(lrmodel, trainingData);

    //Import and get Transformer
    Transformer transformer = ModelImporter.importAndGetTransformer(exportedModel);

    //validate predictions
    List<LabeledPoint> testPoints = MLUtils.loadLibSVMFile(sc.sc(), datapath).toJavaRDD().collect();
    for (LabeledPoint i : testPoints) {
        Vector v = i.features();
        double actual = lrmodel.predict(v);

        Map<String, Object> data = new HashMap<String, Object>();
        data.put("features", v.toArray());
        transformer.transform(data);
        double predicted = (double) data.get("prediction");

        assertEquals(actual, predicted, EPSILON);
    }
}
 
Developer: flipkart-incubator · Project: spark-transformers · Lines: 31 · Source: LogisticRegression1BridgeTest.java


Example 4: shouldExportAndImportCorrectly (Spark 1.x DataFrame variant of Example 2)

import org.apache.spark.ml.classification.LogisticRegression; // import the required package/class
@Test
public void shouldExportAndImportCorrectly() {
    //prepare data
    String datapath = "src/test/resources/binary_classification_test.libsvm";

    DataFrame trainingData = sqlContext.read().format("libsvm").load(datapath);

    //Train model in spark
    LogisticRegressionModel lrmodel = new LogisticRegression().fit(trainingData);

    //Export this model
    byte[] exportedModel = ModelExporter.export(lrmodel, trainingData);

    //Import it back
    LogisticRegressionModelInfo importedModel = (LogisticRegressionModelInfo) ModelImporter.importModelInfo(exportedModel);

    //check that the two models are equal field by field
    //(watch for edge cases, e.g. the order of elements in a list changing)
    assertEquals(lrmodel.intercept(), importedModel.getIntercept(), EPSILON);
    assertEquals(lrmodel.numClasses(), importedModel.getNumClasses(), EPSILON);
    assertEquals(lrmodel.numFeatures(), importedModel.getNumFeatures(), EPSILON);
    assertEquals(lrmodel.getThreshold(), importedModel.getThreshold(), EPSILON);
    for (int i = 0; i < importedModel.getNumFeatures(); i++)
        assertEquals(lrmodel.weights().toArray()[i], importedModel.getWeights()[i], EPSILON);

    assertEquals(lrmodel.getFeaturesCol(), importedModel.getInputKeys().iterator().next());
    assertEquals(lrmodel.getPredictionCol(), importedModel.getOutputKeys().iterator().next());
}
 
Developer: flipkart-incubator · Project: spark-transformers · Lines: 29 · Source: LogisticRegression1ExporterTest.java


Example 5: main

import org.apache.spark.ml.classification.LogisticRegression; // import the required package/class
public static void main(String[] args) {
   SparkSession spark = SparkSession.builder()
     .master("local")
     .config("spark.sql.warehouse.dir", "file:///C:/Users/sumit.kumar/Downloads/bin/warehouse")
     .appName("JavaEstimatorTransformerParamExample")
     .getOrCreate();
   Logger rootLogger = LogManager.getRootLogger();
   rootLogger.setLevel(Level.WARN);
   // $example on$
   // Prepare training data.
   List<Row> dataTraining = Arrays.asList(
       RowFactory.create(1.0, Vectors.dense(0.0, 1.1, 0.1)),
       RowFactory.create(0.0, Vectors.dense(2.0, 1.0, -1.0)),
       RowFactory.create(0.0, Vectors.dense(2.0, 1.3, 1.0)),
       RowFactory.create(1.0, Vectors.dense(0.0, 1.2, -0.5))
   );
   StructType schema = new StructType(new StructField[]{
       new StructField("label", DataTypes.DoubleType, false, Metadata.empty()),
       new StructField("features", new VectorUDT(), false, Metadata.empty())
   });
   Dataset<Row> training = spark.createDataFrame(dataTraining, schema);

   // Create a LogisticRegression instance. This instance is an Estimator.
   LogisticRegression lr = new LogisticRegression();
   // Print out the parameters, documentation, and any default values.
   System.out.println("LogisticRegression parameters:\n" + lr.explainParams() + "\n");

   // We may set parameters using setter methods.
   lr.setMaxIter(10).setRegParam(0.01);

   // Learn a LogisticRegression model. This uses the parameters stored in lr.
   LogisticRegressionModel model1 = lr.fit(training);
   // Since model1 is a Model (i.e., a Transformer produced by an Estimator),
   // we can view the parameters it used during fit().
   // This prints the parameter (name: value) pairs, where names are unique IDs for this
   // LogisticRegression instance.
   System.out.println("Model 1 was fit using parameters: " + model1.parent().extractParamMap());

   // We may alternatively specify parameters using a ParamMap.
   ParamMap paramMap = new ParamMap()
     .put(lr.maxIter().w(20))  // Specify 1 Param.
     .put(lr.maxIter(), 30)  // This overwrites the original maxIter.
     .put(lr.regParam().w(0.1), lr.threshold().w(0.55));  // Specify multiple Params.

   // One can also combine ParamMaps.
   ParamMap paramMap2 = new ParamMap()
     .put(lr.probabilityCol().w("myProbability"));  // Change output column name
   ParamMap paramMapCombined = paramMap.$plus$plus(paramMap2);

   // Now learn a new model using the paramMapCombined parameters.
   // paramMapCombined overrides all parameters set earlier via lr.set* methods.
   LogisticRegressionModel model2 = lr.fit(training, paramMapCombined);
   System.out.println("Model 2 was fit using parameters: " + model2.parent().extractParamMap());

   // Prepare test documents.
   List<Row> dataTest = Arrays.asList(
       RowFactory.create(1.0, Vectors.dense(-1.0, 1.5, 1.3)),
       RowFactory.create(0.0, Vectors.dense(3.0, 2.0, -0.1)),
       RowFactory.create(1.0, Vectors.dense(0.0, 2.2, -1.5))
   );
   Dataset<Row> test = spark.createDataFrame(dataTest, schema);

   // Make predictions on test documents using the Transformer.transform() method.
   // LogisticRegression.transform will only use the 'features' column.
   // Note that model2.transform() outputs a 'myProbability' column instead of the usual
   // 'probability' column since we renamed the lr.probabilityCol parameter previously.
   Dataset<Row> results = model2.transform(test);
   Dataset<Row> rows = results.select("features", "label", "myProbability", "prediction");
   for (Row r: rows.collectAsList()) {
     System.out.println("(" + r.get(0) + ", " + r.get(1) + ") -> prob=" + r.get(2)
       + ", prediction=" + r.get(3));
   }
   // $example off$

   spark.stop();
 }
 
Developer: PacktPublishing · Project: Apache-Spark-2x-for-Java-Developers · Lines: 76 · Source: JavaEstimatorTransformerParamExample.java


Example 6: train

import org.apache.spark.ml.classification.LogisticRegression; // import the required package/class
/**
 * Trains a whitespace classifier model and saves the resulting pipeline model
 * to an external file.
 * @param sentences a list of tokenized sentences.
 * @param pipelineModelFileName the file to save the fitted pipeline model to.
 * @param numFeatures the number of features for the hashing term-frequency stage.
 */
public void train(List<String> sentences, String pipelineModelFileName, int numFeatures) {
	List<WhitespaceContext> contexts = new ArrayList<WhitespaceContext>(sentences.size());
	int id = 0;
	for (String sentence : sentences) {
		sentence = sentence.trim();
		for (int j = 0; j < sentence.length(); j++) {
			char c = sentence.charAt(j);
			if (c == ' ' || c == '_') {
				WhitespaceContext context = new WhitespaceContext();
				context.setId(id++);
				context.setContext(extractContext(sentence, j));
				context.setLabel(c == ' ' ? 0d : 1d);
				contexts.add(context);
			}
		}
	}
	JavaRDD<WhitespaceContext> jrdd = jsc.parallelize(contexts);
	DataFrame df = sqlContext.createDataFrame(jrdd, WhitespaceContext.class);
	df.show(false);
	System.out.println("N = " + df.count());
	df.groupBy("label").count().show();
	
	org.apache.spark.ml.feature.Tokenizer tokenizer = new Tokenizer()
			.setInputCol("context").setOutputCol("words");
	HashingTF hashingTF = new HashingTF().setNumFeatures(numFeatures)
			.setInputCol(tokenizer.getOutputCol()).setOutputCol("features");
	LogisticRegression lr = new LogisticRegression().setMaxIter(100)
			.setRegParam(0.01);
	Pipeline pipeline = new Pipeline().setStages(new PipelineStage[] {
			tokenizer, hashingTF, lr });
	model = pipeline.fit(df);
	
	try {
		model.write().overwrite().save(pipelineModelFileName);
	} catch (IOException e) {
		e.printStackTrace();
	}
	
	DataFrame predictions = model.transform(df);
	predictions.show();
	MulticlassClassificationEvaluator evaluator = new MulticlassClassificationEvaluator().setMetricName("precision");
	double accuracy = evaluator.evaluate(predictions);
	System.out.println("training accuracy = " + accuracy);
	
	LogisticRegressionModel lrModel = (LogisticRegressionModel) model.stages()[2];
	LogisticRegressionTrainingSummary trainingSummary = lrModel.summary();
	double[] objectiveHistory = trainingSummary.objectiveHistory();
	System.out.println("#(iterations) = " + objectiveHistory.length);
	for (double lossPerIteration : objectiveHistory) {
	  System.out.println(lossPerIteration);
	}
	
}
 
Developer: phuonglh · Project: vn.vitk · Lines: 61 · Source: WhitespaceClassifier.java
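The extractContext helper that builds each training context is not included in the snippet. Purely as a hypothetical illustration of what such a helper could look like, assuming a fixed character window around the candidate position (the window size and token scheme are guesses, not vn.vitk's actual implementation):

// Hypothetical sketch; the project's real extractContext may differ.
private static final int WINDOW = 3; // assumed context width on each side

private static String extractContext(String sentence, int j) {
    // Collect up to WINDOW characters on each side of position j, space-separated
    // so the downstream Tokenizer/HashingTF stages can treat each one as a token.
    StringBuilder sb = new StringBuilder();
    for (int k = Math.max(0, j - WINDOW); k <= Math.min(sentence.length() - 1, j + WINDOW); k++) {
        if (k == j) continue; // skip the candidate space/underscore itself
        sb.append(sentence.charAt(k)).append(' ');
    }
    return sb.toString().trim();
}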


Example 7: testPipeline

import org.apache.spark.ml.classification.LogisticRegression; // import the required package/class
@Test
public void testPipeline() {
    // Prepare training documents, which are labeled.
    StructType schema = createStructType(new StructField[]{
            createStructField("id", LongType, false),
            createStructField("text", StringType, false),
            createStructField("label", DoubleType, false)
    });
    Dataset<Row> trainingData = spark.createDataFrame(Arrays.asList(
            cr(0L, "a b c d e spark", 1.0),
            cr(1L, "b d", 0.0),
            cr(2L, "spark f g h", 1.0),
            cr(3L, "hadoop mapreduce", 0.0)
    ), schema);

    // Configure an ML pipeline, which consists of three stages: tokenizer, hashingTF, and LogisticRegression.
    RegexTokenizer tokenizer = new RegexTokenizer()
            .setInputCol("text")
            .setOutputCol("words")
            .setPattern("\\s")
            .setGaps(true)
            .setToLowercase(false);

    HashingTF hashingTF = new HashingTF()
            .setNumFeatures(1000)
            .setInputCol(tokenizer.getOutputCol())
            .setOutputCol("features");
    LogisticRegression lr = new LogisticRegression()
            .setMaxIter(10)
            .setRegParam(0.01);
    Pipeline pipeline = new Pipeline()
            .setStages(new PipelineStage[]{tokenizer, hashingTF, lr});

    // Fit the pipeline to training documents.
    PipelineModel sparkPipelineModel = pipeline.fit(trainingData);


    //Export this model
    byte[] exportedModel = ModelExporter.export(sparkPipelineModel);
    System.out.println(new String(exportedModel));

    //Import and get Transformer
    Transformer transformer = ModelImporter.importAndGetTransformer(exportedModel);

    //prepare test data
    StructType testSchema = createStructType(new StructField[]{
            createStructField("id", LongType, false),
            createStructField("text", StringType, false),
    });
    Dataset<Row> testData = spark.createDataFrame(Arrays.asList(
            cr(4L, "spark i j k"),
            cr(5L, "l m n"),
            cr(6L, "mapreduce spark"),
            cr(7L, "apache hadoop")
    ), testSchema);

    //verify that predictions for spark pipeline and exported pipeline are the same
    List<Row> predictions = sparkPipelineModel.transform(testData).select("id", "text", "probability", "prediction").collectAsList();
    for (Row r : predictions) {
        System.out.println(r);
        double sparkPipelineOp = r.getDouble(3);
        Map<String, Object> data = new HashMap<String, Object>();
        data.put("text", r.getString(1));
        transformer.transform(data);
        double exportedPipelineOp = (double) data.get("prediction");
        double exportedPipelineProb = (double) data.get("probability");
        assertEquals(sparkPipelineOp, exportedPipelineOp, 0.01);
    }
}
 
Developer: flipkart-incubator · Project: spark-transformers · Lines: 70 · Source: PipelineBridgeTest.java


Example 8: testPipeline (Spark 1.x DataFrame variant of Example 7)

import org.apache.spark.ml.classification.LogisticRegression; // import the required package/class
@Test
public void testPipeline() {
    // Prepare training documents, which are labeled.
    StructType schema = createStructType(new StructField[]{
            createStructField("id", LongType, false),
            createStructField("text", StringType, false),
            createStructField("label", DoubleType, false)
    });
    DataFrame trainingData = sqlContext.createDataFrame(Arrays.asList(
            cr(0L, "a b c d e spark", 1.0),
            cr(1L, "b d", 0.0),
            cr(2L, "spark f g h", 1.0),
            cr(3L, "hadoop mapreduce", 0.0)
    ), schema);

    // Configure an ML pipeline, which consists of three stages: tokenizer, hashingTF, and LogisticRegression.
    RegexTokenizer tokenizer = new RegexTokenizer()
            .setInputCol("text")
            .setOutputCol("words")
            .setPattern("\\s")
            .setGaps(true)
            .setToLowercase(false);

    HashingTF hashingTF = new HashingTF()
            .setNumFeatures(1000)
            .setInputCol(tokenizer.getOutputCol())
            .setOutputCol("features");
    LogisticRegression lr = new LogisticRegression()
            .setMaxIter(10)
            .setRegParam(0.01);
    Pipeline pipeline = new Pipeline()
            .setStages(new PipelineStage[]{tokenizer, hashingTF, lr});

    // Fit the pipeline to training documents.
    PipelineModel sparkPipelineModel = pipeline.fit(trainingData);


    //Export this model
    byte[] exportedModel = ModelExporter.export(sparkPipelineModel, trainingData);
    System.out.println(new String(exportedModel));

    //Import and get Transformer
    Transformer transformer = ModelImporter.importAndGetTransformer(exportedModel);

    //prepare test data
    StructType testSchema = createStructType(new StructField[]{
            createStructField("id", LongType, false),
            createStructField("text", StringType, false),
    });
    DataFrame testData = sqlContext.createDataFrame(Arrays.asList(
            cr(4L, "spark i j k"),
            cr(5L, "l m n"),
            cr(6L, "mapreduce spark"),
            cr(7L, "apache hadoop")
    ), testSchema);

    //verify that predictions for spark pipeline and exported pipeline are the same
    Row[] predictions = sparkPipelineModel.transform(testData).select("id", "text", "probability", "prediction").collect();
    for (Row r : predictions) {
        System.out.println(r);
        double sparkPipelineOp = r.getDouble(3);
        Map<String, Object> data = new HashMap<String, Object>();
        data.put("text", r.getString(1));
        transformer.transform(data);
        double exportedPipelineOp = (double) data.get("prediction");
        double exportedPipelineProb = (double) data.get("probability");
        assertEquals(sparkPipelineOp, exportedPipelineOp, EPSILON);
    }
}
 
Developer: flipkart-incubator · Project: spark-transformers · Lines: 70 · Source: PipelineBridgeTest.java


Example 9: trainModel

import org.apache.spark.ml.classification.LogisticRegression; // import the required package/class
private static Transformer trainModel(SQLContext sqlContext, DataFrame train, String tokenizerOutputCol, boolean useCV) {
    train = getCommonFeatures(sqlContext, train, TOKENIZER_OUTPUT);

    VectorAssembler featuresForNorm = new VectorAssembler()
            .setInputCols(new String[] {"commonfeatures"})
            .setOutputCol("commonfeatures_norm");

    Normalizer norm = new Normalizer()
            .setInputCol(featuresForNorm.getOutputCol())
            .setOutputCol("norm_features");

    HashingTF hashingTF = new HashingTF()
            .setInputCol("ngrams")
            .setOutputCol("tf");

    IDF idf = new IDF()
            .setInputCol(hashingTF.getOutputCol())
            .setOutputCol("idf");

    // Learn a mapping from words to Vectors.
    Word2Vec word2Vec = new Word2Vec()
            .setInputCol(tokenizerOutputCol)
            .setOutputCol("w2v");

    List<String> assemblerInput = new ArrayList<>();
    assemblerInput.add("commonfeatures");
//  assemblerInput.add(norm.getOutputCol());
//  assemblerInput.add(idf.getOutputCol());
//  assemblerInput.add(word2Vec.getOutputCol());
    assemblerInput.add(W2V_DB);

    VectorAssembler assembler = new VectorAssembler()
            .setInputCols(assemblerInput.toArray(new String[assemblerInput.size()]))
            .setOutputCol("features");

    LogisticRegression lr = new LogisticRegression();

//  int[] layers = new int[] {108, 10, 10, 2};
//  // create the trainer and set its parameters
//  MultilayerPerceptronClassifier perceptron = new MultilayerPerceptronClassifier()
//          .setLayers(layers)
//          .setBlockSize(128)
//          .setSeed(1234L)
//          .setMaxIter(100);
//          .setRegParam(0.03);
//          .setElasticNetParam(0.3);

    // Unused stages kept above for reference: ngramTransformer, hashingTF, idf, word2Vec, featuresForNorm, norm.
    PipelineStage[] pipelineStages = new PipelineStage[] { w2vModel, assembler, lr };
    Pipeline pipeline = new Pipeline()
            .setStages(pipelineStages);

    stagesToString = ("commonfeatures_suff1x\t" + StringUtils.join(pipelineStages, "\t"))
            .replaceAll("([A-Za-z]+)_[0-9A-Za-z]+", "$1");

    // Use a ParamGridBuilder to construct a grid of parameters to search over.
    // Only lr.maxIter is active here; the commented-out entries document the
    // larger search space that was experimented with.
    ParamMap[] paramGrid = new ParamGridBuilder()
//          .addGrid(word2Vec.vectorSize(), new int[] {100, 500})
//          .addGrid(word2Vec.minCount(), new int[] {2, 3, 4})
//          .addGrid(ngramTransformer.n(), new int[] {2, 3})
//          .addGrid(hashingTF.numFeatures(), new int[] {1000, 2000})
            .addGrid(lr.maxIter(), new int[] {10})
//          .addGrid(lr.regParam(), new double[] {0.0, 0.1, 0.4, 0.8, 1, 3, 5, 10})
//          .addGrid(lr.fitIntercept())
//          .addGrid(lr.elasticNetParam(), new double[] {0.0, 0.2, 0.5, 0.8, 1.0})
//          .addGrid(idf.minDocFreq(), new int[] {2, 4})
            .build();

    Transformer model;
    if (!useCV) {
        model = trainWithValidationSplit(train, pipeline, paramGrid);
    } else {
        model = trainWithCrossValidation(train, pipeline, paramGrid);
    }

    return model;
}
 
Developer: mhardalov · Project: news-credibility · Lines: 80 · Source: NewsCredibilityMain.java
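The helpers trainWithValidationSplit and trainWithCrossValidation are not shown in the snippet. As a hypothetical sketch of what the cross-validation branch might look like with the Spark 1.x tuning API used above (the evaluator and fold count are assumptions):

import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator;
import org.apache.spark.ml.tuning.CrossValidator;
import org.apache.spark.ml.tuning.CrossValidatorModel;

// Hypothetical helper; not taken from the original project.
private static Transformer trainWithCrossValidation(DataFrame train, Pipeline pipeline, ParamMap[] paramGrid) {
    CrossValidator cv = new CrossValidator()
            .setEstimator(pipeline)
            .setEvaluator(new BinaryClassificationEvaluator()) // assumed metric
            .setEstimatorParamMaps(paramGrid)
            .setNumFolds(3);                                   // assumed fold count
    CrossValidatorModel cvModel = cv.fit(train);
    return cvModel.bestModel(); // the best pipeline, usable as a Transformer
}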



Note: the org.apache.spark.ml.classification.LogisticRegression examples above are compiled from source-code and documentation platforms such as GitHub and MSDocs. The code fragments come from open-source projects contributed by their respective authors, who retain copyright; consult each project's license before redistributing or reusing the code. Do not republish without permission.

