LogisticRegressionWithLBFGS
- class pyspark.mllib.classification.LogisticRegressionWithLBFGS
- Train a classification model for multinomial or binary logistic regression using limited-memory BFGS (L-BFGS).
- Standard feature scaling and L2 regularization are used by default.
- New in version 1.2.0.
- Methods
- train(data[, iterations, initialWeights, ...]): Train a logistic regression model on the given data.
- Methods Documentation
- classmethod train(data, iterations=100, initialWeights=None, regParam=0.0, regType='l2', intercept=False, corrections=10, tolerance=1e-06, validateData=True, numClasses=2)
- Train a logistic regression model on the given data.
- New in version 1.2.0.
- Parameters
- data : pyspark.RDD
- The training data, an RDD of pyspark.mllib.regression.LabeledPoint.
- iterations : int, optional
- The number of iterations. (default: 100)
- initialWeights : pyspark.mllib.linalg.Vector or convertible, optional
- The initial weights. (default: None)
- regParam : float, optional
- The regularization parameter. (default: 0.0)
- regType : str, optional
- The type of regularizer used for training the model. Supported values:
- “l1” for using L1 regularization
- “l2” for using L2 regularization (default)
- None for no regularization
 
- intercept : bool, optional
- Whether to use the augmented representation for training data (i.e., whether bias features are activated). (default: False)
- corrections : int, optional
- The number of corrections used in the L-BFGS update. If a known updater is used for binary classification, the ml implementation is called and this parameter has no effect. (default: 10)
- tolerance : float, optional
- The convergence tolerance of iterations for L-BFGS. (default: 1e-6)
- validateData : bool, optional
- Whether the algorithm should validate the data before training. (default: True)
- numClasses : int, optional
- The number of classes (i.e., outcomes) a label can take in multinomial logistic regression. (default: 2)
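As a rough illustration of the regParam, regType and intercept parameters above, the following pure-Python sketch computes the penalty term that each regType value would add to the loss, and shows what an augmented representation of a feature vector looks like. The helpers `penalty` and `augment` are hypothetical and not part of PySpark. The factor of 1/2 on the L2 term and the position of the bias feature are assumptions about the underlying implementation, not something this page states.

```python
from typing import List, Optional, Sequence


def penalty(weights: Sequence[float], reg_param: float, reg_type: Optional[str]) -> float:
    """Regularization term added to the loss (hypothetical helper, not a PySpark API)."""
    if reg_type is None:
        # No regularization: the penalty is zero regardless of reg_param.
        return 0.0
    if reg_type == "l1":
        # L1: reg_param times the sum of absolute weights.
        return reg_param * sum(abs(w) for w in weights)
    if reg_type == "l2":
        # L2: assumed here to be (reg_param / 2) times the sum of squared weights.
        return 0.5 * reg_param * sum(w * w for w in weights)
    raise ValueError(f"unsupported regType: {reg_type!r}")


def augment(features: Sequence[float]) -> List[float]:
    """Append a constant bias feature so the intercept is learned as one more weight.

    Appending at the end is an assumption made for illustration.
    """
    return list(features) + [1.0]


weights = [0.5, -2.0]  # hypothetical weights
for reg_type in ("l1", "l2", None):
    print(reg_type, penalty(weights, 0.1, reg_type))
print(augment([1.0, 0.0]))
```

With the signature default regParam=0.0, the penalty is zero for every regType, so regType only matters once regParam is set to a positive value.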
 
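- Examples
>>> from pyspark.mllib.classification import LogisticRegressionWithLBFGS
>>> from pyspark.mllib.regression import LabeledPoint
>>> # `sc` is assumed to be an existing SparkContext.
>>> data = [
...     LabeledPoint(0.0, [0.0, 1.0]),
...     LabeledPoint(1.0, [1.0, 0.0]),
... ]
>>> lrm = LogisticRegressionWithLBFGS.train(sc.parallelize(data), iterations=10)
>>> lrm.predict([1.0, 0.0])
1
>>> lrm.predict([0.0, 1.0])
0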
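To make the 0/1 predictions in the example above more concrete, here is a minimal pure-Python sketch of how a binary logistic regression model might turn a feature vector into a label, assuming it thresholds the logistic score at 0.5. The function `predict_label` and the weights are hypothetical, chosen only for illustration. They are not the weights that the training call above would return.

```python
import math
from typing import Sequence


def predict_label(
    weights: Sequence[float],
    features: Sequence[float],
    intercept: float = 0.0,
    threshold: float = 0.5,
) -> int:
    """Binary prediction from a linear margin (hypothetical helper, not a PySpark API)."""
    margin = sum(w * x for w, x in zip(weights, features)) + intercept
    # Evaluate the logistic function in a numerically stable way
    # by never exponentiating a large positive number.
    if margin > 0:
        score = 1.0 / (1.0 + math.exp(-margin))
    else:
        exp_margin = math.exp(margin)
        score = exp_margin / (1.0 + exp_margin)
    return 1 if score > threshold else 0


weights = [2.0, -2.0]  # hypothetical weights, not the result of training
print(predict_label(weights, [1.0, 0.0]))  # margin is 2.0, prints 1
print(predict_label(weights, [0.0, 1.0]))  # margin is -2.0, prints 0
```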