Tutorial
========
Python
------
Model deployment from Python - an easy way

Throughout this tutorial we are going to build and deploy a Python scikit-learn model for online scoring, so that it can be queried from other languages and environments such as R, Java, JS or PHP.
Prerequisites
^^^^^^^^^^^^^
To get started you need:
#. Python (version 3) with the following packages installed:

   #. pandas
   #. scikit-learn

#. an account at scoringduck.com
Create Model
^^^^^^^^^^^^
A simple risk classification model is built on the German credit dataset::
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer as Imputer
from sklearn import tree
from rtblib.ml import models
# Download the German credit data
url = "http://freakonometrics.free.fr/german_credit.csv"
df = pd.read_csv(url)
# Select five input features and the binary target
input_df = df[['Account Balance', 'Payment Status of Previous Credit', 'Purpose', 'Length of current employment', 'Sex & Marital Status']].copy()
target = df['Creditability'].copy()
# Mean imputation followed by a decision tree
tree_opts = {'min_samples_leaf': 30, 'max_features': None, 'max_depth': 10}
clf = Pipeline([("imputer", Imputer(strategy="mean")), ("model", tree.DecisionTreeClassifier(**tree_opts))])
clf = clf.fit(input_df, target)
# Wrap the fitted pipeline together with its input column names and save it to a file
mod = models.ModelScikit(clf, list(input_df.columns), 'germancredittree', '1.0', model_type='CLASSIFICATION')
models.save('german_credit_tree.mod', mod)
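The ``imputer`` step of the pipeline above replaces missing values with the column mean learned during ``fit``. The sketch below illustrates that idea in plain Python on made-up values. It is not scikit-learn's implementation, and the helper names ``fit_mean`` and ``impute`` are hypothetical.

```python
import math


def fit_mean(column):
    """Return the mean of the non-missing values in a training column."""
    present = [v for v in column if v is not None and not math.isnan(v)]
    return sum(present) / len(present)


def impute(column, fill_value):
    """Replace missing entries (None or NaN) with fill_value."""
    return [fill_value if v is None or math.isnan(v) else v for v in column]


# Made-up training values for a single feature; None marks a missing entry.
train_column = [4, 2, None, 3]
mean_value = fit_mean(train_column)  # (4 + 2 + 3) / 3

# At scoring time the mean learned from the training data is reused.
new_column = [None, 1]
imputed = impute(new_column, mean_value)
print(imputed)
```

Because the mean is stored inside the fitted pipeline, a scoring request with a missing field can still be processed, at least under the assumption that the missing value reaches the pipeline as NaN.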
Deploy Model
^^^^^^^^^^^^
If you have not created an account at scoringduck.com yet, now is the time to do so. Then install the Python client::
pip install http://app.scoringduck.com/static/jaga_client-0.0.1-py3-none-any.whl
The last lines of the output should be similar to::
Installing collected packages: jaga-client
Successfully installed jaga-client-0.0.1
Connect to the server and check that it works. Log in to your account at app.scoringduck.com and find your token; it is passed as the third argument when creating the ``JagaClient`` object::
from jagaclient import JagaClient
jg = JagaClient("http://app.scoringduck.com/","YOUR_USERNAME","YOUR_API_KEY")
ret_list = jg.list_models()
print(ret_list)
Output is similar to::
[{'name': 'german_credit_logit', 'version': '1', 'modeltype': 'R', 'uploaddatetime': '2020-12-18 11:27:01', 'isdefault': True, 'isarchived': False, 'publicaccess': False}]
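The connection snippet above hard-codes the token in the script. A minimal sketch of reading the credentials from environment variables instead is shown below. The variable names ``SCORINGDUCK_USERNAME`` and ``SCORINGDUCK_API_KEY`` and the helper ``load_credentials`` are assumptions made for this example and are not part of the client.

```python
import os

SERVER_URL = "http://app.scoringduck.com/"


def load_credentials(env=os.environ):
    """Return (url, username, api_key) in the order JagaClient expects.

    The environment variable names are hypothetical; choose your own.
    """
    try:
        username = env["SCORINGDUCK_USERNAME"]
        api_key = env["SCORINGDUCK_API_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"environment variable {missing} is not set") from None
    return SERVER_URL, username, api_key


# Demonstration with an in-memory mapping instead of the real environment.
demo_env = {
    "SCORINGDUCK_USERNAME": "YOUR_USERNAME",
    "SCORINGDUCK_API_KEY": "YOUR_API_KEY",
}
client_args = load_credentials(demo_env)
print(client_args)
# With the client installed: jg = JagaClient(*client_args)
```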
A single model (named ``german_credit_logit``) is provided by default when the account is created. Now deploy the model created earlier::
result = jg.deploy("germancredittree", "1", "german_credit_tree.mod")
print(result)
Output should be similar to::
{'model': 'germancredittree', 'version': '1', 'msg': 'New version of model germancredittree 1 deployed', 'elapsed': 0.10997330000100192}
``deploy`` raises ``RuntimeError`` if something goes wrong.

Check that the model is there::
ret_list = jg.list_models()
print(ret_list)
Output is similar to::
[{'name': 'germancredittree', 'version': '1', 'modeltype': 'Scikit', 'uploaddatetime': '2020-12-18 13:36:05', 'isdefault': True, 'isarchived': False, 'publicaccess': False}, {'name': 'german_credit_logit', 'version': '1', 'modeltype': 'R', 'uploaddatetime': '2020-12-18 11:27:01', 'isdefault': True, 'isarchived': False, 'publicaccess': False}]
You have just deployed the model to the scoringduck scoring engine. Now it's available for querying from various tools/environments including Python.
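The two steps above (deploy, then check the model list) can be combined into one helper that also handles the ``RuntimeError`` raised by ``deploy``. The sketch below runs offline against ``StubClient``, a hypothetical stand-in that only mimics the return shapes shown in this tutorial. Its rule that re-deploying an existing version fails is an assumption made for the demonstration, not documented server behaviour.

```python
class StubClient:
    """Offline stand-in mimicking the deploy/list_models shapes shown above."""

    def __init__(self):
        self._models = [{"name": "german_credit_logit", "version": "1"}]

    def deploy(self, name, version, path):
        # Assumed rule for the demo: an existing name/version pair is rejected.
        if any(m["name"] == name and m["version"] == version for m in self._models):
            raise RuntimeError(f"model {name} {version} already exists")
        self._models.insert(0, {"name": name, "version": version})
        return {"model": name, "version": version}

    def list_models(self):
        return list(self._models)


def deploy_and_verify(client, name, version, path):
    """Deploy a model file and confirm it appears in list_models().

    Returns True on success and False when deploy raises RuntimeError.
    """
    try:
        client.deploy(name, version, path)
    except RuntimeError as err:
        print(f"deployment failed: {err}")
        return False
    return any(
        m["name"] == name and m["version"] == version
        for m in client.list_models()
    )


jg = StubClient()  # replace with a real JagaClient instance
first = deploy_and_verify(jg, "germancredittree", "1", "german_credit_tree.mod")
second = deploy_and_verify(jg, "germancredittree", "1", "german_credit_tree.mod")
print(first, second)
```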
Score data
^^^^^^^^^^
Let's predict using the following data, taken from one row of the credit dataset:
+-----------------------------------+---+
| Account Balance                   | 4 |
+-----------------------------------+---+
| Payment Status of Previous Credit | 3 |
+-----------------------------------+---+
| Purpose                           | 3 |
+-----------------------------------+---+
| Length of current employment      | 4 |
+-----------------------------------+---+
| Sex & Marital Status              | 3 |
+-----------------------------------+---+
The Python client accepts a dictionary as input data::
to_score = {"Account Balance": 4, "Payment Status of Previous Credit": 3, "Purpose": 3, "Length of current employment": 4, "Sex & Marital Status": 3}
result = jg.score("germancredittree", "1", to_score)
print(result)
Output is::
{'model': 'germancredittree', 'version': '1', 'result': {'res': 0.9545454545454546}, 'elapsed': 0.013599499998235842}
The predicted value is named ``res``. Easy, right? The model can be used in the same way from other languages and environments. Refer to the R tutorial for information on installing the R client and scoring with it.
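A short sketch of reading ``res`` from the response and turning it into a decision follows. It assumes that ``res`` is the estimated probability of ``Creditability`` being 1, which this tutorial does not state explicitly. The helper names and the cut-off values are hypothetical.

```python
def extract_score(response, key="res"):
    """Return the predicted value from a score() response dictionary."""
    return response["result"][key]


def is_creditworthy(response, threshold=0.5):
    """Apply a cut-off to the score; the 0.5 default is an arbitrary example."""
    return extract_score(response) >= threshold


# Response copied from the output shown above.
response = {
    "model": "germancredittree",
    "version": "1",
    "result": {"res": 0.9545454545454546},
    "elapsed": 0.013599499998235842,
}

score = extract_score(response)
print(round(score, 4))
print(is_creditworthy(response))
print(is_creditworthy(response, threshold=0.97))
```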
R
-
Based on
* ``_
* ``_
Model deployment from R - an easy way

Throughout this tutorial we are going to build and deploy an R model for online scoring, so that it can be queried from other languages and environments such as Python, Java, JS or PHP.
Prerequisites
^^^^^^^^^^^^^
In order to start you need two things:
#. R
#. an account at scoringduck.com
Create Model
^^^^^^^^^^^^
A simple risk classification model is built on the German credit dataset::
# Download the German credit data
url="http://freakonometrics.free.fr/german_credit.csv"
credit=read.csv(url, header = TRUE, sep = ",")
# Hold out 333 randomly chosen rows for testing and fit on the rest
i_test=sample(1:nrow(credit),size=333)
i_calibration=(1:nrow(credit))[-i_test]
# Logistic regression on five predictors
logistic_model <- glm(Creditability ~ Account.Balance + Payment.Status.of.Previous.Credit + Purpose + Length.of.current.employment + Sex...Marital.Status, family=binomial, data = credit[i_calibration,])
# Save the fitted model to a file for deployment
saveRDS(logistic_model,"german_credit_logit.rds")
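With ``family=binomial`` and its default logit link, ``glm`` fits a model in which the log-odds of ``Creditability`` being 1 are a linear function of the predictors. The sketch below shows how a probability is obtained from such a linear predictor. The intercept and coefficients are made up for illustration and are not the values estimated by the ``glm`` call above.

```python
import math


def logistic(eta):
    """Inverse logit: map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def predict_probability(intercept, coefficients, row):
    """Compute the probability for one row from an intercept and coefficients."""
    eta = intercept + sum(coefficients[name] * row[name] for name in coefficients)
    return logistic(eta)


# Hypothetical values, NOT the coefficients fitted in the R code above.
intercept = -1.0
coefficients = {"Account.Balance": 0.5, "Purpose": -0.1}
row = {"Account.Balance": 4, "Purpose": 3}

# Linear predictor: -1.0 + 0.5 * 4 - 0.1 * 3 = 0.7
p = predict_probability(intercept, coefficients, row)
print(round(p, 4))
```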
Deploy Model
^^^^^^^^^^^^
If you have not created an account at scoringduck.com yet, now is the time to do so. Then install the R client::
install.packages(pkgs="http://app.scoringduck.com/static/jagaclient_0.1.0.tar.gz",repos=NULL,type="source")
Output should be similar to::
* installing *source* package 'jagaclient' ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
converting help for package 'jagaclient'
finding HTML links ... done
connect html
deploy html
hello html
list_models html
score html
** building package indices
** testing if installed package can be loaded from temporary location
*** arch - i386
*** arch - x64
** testing if installed package can be loaded from final location
*** arch - i386
*** arch - x64
** testing if installed package keeps a record of temporary installation path
* DONE (jagaclient)
Check the installation; the following command should print ``TRUE``::
print("jagaclient" %in% rownames(installed.packages()))
Connect to the server and check that it works. Log in to your account at app.scoringduck.com and find your token; it is passed as the second argument of ``connect``::
library("jagaclient")
conndata <- jagaclient::connect("YOUR_USERNAME","YOUR_API_KEY","http://app.scoringduck.com/")
ret_list <- jagaclient::list_models(conndata)
print(ret_list)
Output is similar to::
                 name version modeltype      uploaddatetime isdefault isarchived publicaccess
1 german_credit_logit       1         R 2020-12-18 11:27:01      TRUE      FALSE        FALSE
A single model (named ``german_credit_logit``) is provided by default when the account is created; it is the same as the one you have just built. If it were not provided, you would deploy it as follows::
result <- jagaclient::deploy(conndata, "german_credit_logit", "1", "german_credit_logit.rds")
print(result)
``deploy`` returns ``TRUE`` if everything went correctly. Deployed models are available for querying from various tools and environments, including R.
Score data
^^^^^^^^^^
Let's predict using the following data, taken from one row of the credit dataset:
+-----------------------------------+---+
| Account.Balance                   | 4 |
+-----------------------------------+---+
| Payment.Status.of.Previous.Credit | 3 |
+-----------------------------------+---+
| Purpose                           | 3 |
+-----------------------------------+---+
| Length.of.current.employment      | 4 |
+-----------------------------------+---+
| Sex...Marital.Status              | 3 |
+-----------------------------------+---+
The server accepts data in JSON format; use ``jsonlite`` to create it::
library(jsonlite)
to_score <- list("Account.Balance"=jsonlite::unbox(4), "Payment.Status.of.Previous.Credit"=jsonlite::unbox(3), "Purpose"=jsonlite::unbox(3), "Length.of.current.employment"=jsonlite::unbox(4), "Sex...Marital.Status"=jsonlite::unbox(3))
args <- jsonlite::toJSON(to_score)
print(args)
Output is::
{"Account.Balance":4,"Payment.Status.of.Previous.Credit":3,"Purpose":3,"Length.of.current.employment":4,"Sex...Marital.Status":3}
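For comparison, the same JSON text can be produced outside R. The sketch below uses Python's standard ``json`` module with compact separators. Note that the Python client shown elsewhere in this tutorial takes a dictionary directly, so this step is only an illustration of the payload format.

```python
import json

# Same field names and values as in the R list above.
to_score = {
    "Account.Balance": 4,
    "Payment.Status.of.Previous.Credit": 3,
    "Purpose": 3,
    "Length.of.current.employment": 4,
    "Sex...Marital.Status": 3,
}

# Compact separators give the same layout as the jsonlite output above.
args = json.dumps(to_score, separators=(",", ":"))
print(args)
```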
Score it::
result <- jagaclient::score(conndata, "german_credit_logit", "1", args)
print(result)
Output should be similar to::
$model
[1] "german_credit_logit"
$version
[1] "1"
$result
$result$Account.Balance
[1] 4
$result$Payment.Status.of.Previous.Credit
[1] 3
$result$Purpose
[1] 3
$result$Length.of.current.employment
[1] 4
$result$Sex...Marital.Status
[1] 3
$result$res
[1] 0.9053125
$elapsed
[1] 0.06597743
The predicted value is named ``res``. Easy, right? The model can be used in the same way from other languages and environments. For example, you can score data sent from a Python script::
from jagaclient import JagaClient
jg = JagaClient("http://app.scoringduck.com/", "YOUR_USERNAME", "YOUR_API_KEY")
args = {"Account.Balance":4,"Payment.Status.of.Previous.Credit":3,"Purpose":3,"Length.of.current.employment":4,"Sex...Marital.Status":3}
result = jg.score("german_credit_logit", "1", args)
print(result)
Output should be similar to::
{'model': 'german_credit_logit', 'version': '1', 'result': {'Account.Balance': 4, 'Payment.Status.of.Previous.Credit': 3, 'Purpose': 3, 'Length.of.current.employment': 4, 'Sex...Marital.Status': 3, 'res': 0.9053124547915977}, 'elapsed': 0.018342145998758497}
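The Python call above uses R-style field names because ``read.csv`` rewrites CSV headers into syntactically valid R names by default. A simplified sketch of that renaming is shown below, in case the input arrives with the original headers. The helper ``r_style_name`` is hypothetical and covers only the replacement of invalid characters by dots, not the other rules of R's ``make.names``.

```python
import re


def r_style_name(column):
    """Approximate R's make.names() for simple headers.

    Every character other than an ASCII letter, digit, dot or underscore
    becomes a dot. Reserved words, names starting with a digit and
    de-duplication are not handled here.
    """
    return re.sub(r"[^A-Za-z0-9._]", ".", column)


# Input keyed by the original CSV headers, as used with the scikit-learn model.
to_score = {
    "Account Balance": 4,
    "Payment Status of Previous Credit": 3,
    "Purpose": 3,
    "Length of current employment": 4,
    "Sex & Marital Status": 3,
}

# Rename the keys so they match what the R model expects.
args = {r_style_name(name): value for name, value in to_score.items()}
print(args)
```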
PMML
----
ScoringDuck supports the *Predictive Model Markup Language* (PMML), so any model-building tool that can export a model to this format may be used. As an example, the model described in the R section is used here.
Prerequisites
^^^^^^^^^^^^^
To get started you need the following:
#. R with the ``r2pmml`` package installed and working
#. an account at scoringduck.com
Create Model
^^^^^^^^^^^^
A simple risk classification model is built on the German credit dataset::
library("r2pmml")
url="http://freakonometrics.free.fr/german_credit.csv"
credit=read.csv(url, header = TRUE, sep = ",")
credit$Creditability=factor(credit$Creditability)
i_test=sample(1:nrow(credit),size=333)
i_calibration=(1:nrow(credit))[-i_test]
logistic_model <- glm(Creditability ~ Account.Balance + Payment.Status.of.Previous.Credit + Purpose + Length.of.current.employment + Sex...Marital.Status, family=binomial, data = credit[i_calibration,])
r2pmml(logistic_model,"german_credit_logit.pmml")
Note that, unlike in the R example, ``Creditability`` is converted from numeric to factor. Skipping this step causes ``r2pmml`` to fail.
Deploy Model
^^^^^^^^^^^^
If you already have a PMML file with a model, you can deploy it without installing any additional software: log in at app.scoringduck.com, click *Deploy models* and use the form. The clients can also deploy PMML files in the same way as models in their native formats (for client installation, refer to the relevant part of the Python or R section). R client usage in this case is as follows::
library("jagaclient")
conndata <- jagaclient::connect("YOUR_USERNAME","YOUR_API_KEY","http://app.scoringduck.com/")
result <- jagaclient::deploy(conndata, "german_credit_logit", "2", "german_credit_logit.pmml")
print(result)
Python client usage in this case is as follows::
from jagaclient import JagaClient
jg = JagaClient("http://app.scoringduck.com/","YOUR_USERNAME","YOUR_API_KEY")
# List the models already deployed to see which versions are taken
ret_list = jg.list_models()
print(ret_list)
result = jg.deploy("german_credit_logit", "2", "german_credit_logit.pmml")
print(result)
Remember to specify a version that does not exist yet. Here ``2`` is used to avoid a conflict with the model described in the R section.
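Picking an unused version can be automated from the ``list_models`` output. The sketch below assumes that versions are strings holding whole numbers, as in the outputs shown in this tutorial. The helper ``next_version`` is hypothetical and simply returns one past the highest existing version of the given model.

```python
def next_version(models, name):
    """Return one past the highest integer version of `name`, as a string."""
    used = {
        int(m["version"])
        for m in models
        if m["name"] == name and m["version"].isdigit()
    }
    return str(max(used, default=0) + 1)


# Model list in the shape returned by list_models() (fields trimmed for brevity).
models = [
    {"name": "germancredittree", "version": "1", "modeltype": "Scikit"},
    {"name": "german_credit_logit", "version": "1", "modeltype": "R"},
]

print(next_version(models, "german_credit_logit"))
print(next_version(models, "brand_new_model"))
```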
Score data
^^^^^^^^^^
Scoring is possible without installing any additional software: log in at app.scoringduck.com, click *Manage* and then *Query* for the desired model. The clients can also be used for scoring. R client usage example::
library(jsonlite)
to_score <- list("Account.Balance"=jsonlite::unbox(4), "Payment.Status.of.Previous.Credit"=jsonlite::unbox(3), "Purpose"=jsonlite::unbox(3), "Length.of.current.employment"=jsonlite::unbox(4), "Sex...Marital.Status"=jsonlite::unbox(3))
# Serialize the input to JSON before scoring
args <- jsonlite::toJSON(to_score)
result <- jagaclient::score(conndata, "german_credit_logit", "2", args)
print(result)
Output should be similar to::
$model
[1] "german_credit_logit"
$version
[1] "2"
$result
$result$`probability(0)`
[1] 0.08665164
$result$`probability(1)`
[1] 0.9133484
$elapsed
[1] 0.5412666
Python client usage example::
from jagaclient import JagaClient
jg = JagaClient("http://app.scoringduck.com/","YOUR_USERNAME","YOUR_API_KEY")
args = {"Account.Balance":4,"Payment.Status.of.Previous.Credit":3,"Purpose":3,"Length.of.current.employment":4,"Sex...Marital.Status":3}
result = jg.score("german_credit_logit","2",args)
print(result)
Output should be similar to::
{'model': 'german_credit_logit', 'version': '2', 'result': {'probability(0)': 0.0866516359066355, 'probability(1)': 0.9133483640933645}, 'elapsed': 0.6012406999998348}
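Unlike the native R model, the PMML model returns one probability per class instead of a single ``res`` value. A minimal sketch of picking the most probable class from such a response follows. The helper ``predicted_class`` is hypothetical and assumes the keys always have the form ``probability(<label>)`` as in the output above.

```python
import re


def predicted_class(result):
    """Return (label, probability) for the most probable class."""
    probs = {}
    for key, value in result.items():
        match = re.fullmatch(r"probability\((.+)\)", key)
        if match:
            probs[match.group(1)] = value
    label = max(probs, key=probs.get)
    return label, probs[label]


# Response copied from the output shown above.
response = {
    "model": "german_credit_logit",
    "version": "2",
    "result": {
        "probability(0)": 0.0866516359066355,
        "probability(1)": 0.9133483640933645,
    },
    "elapsed": 0.6012406999998348,
}

label, prob = predicted_class(response["result"])
print(label, round(prob, 4))
```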
Remember to specify the same version as in *Deploy Model*. For a more detailed explanation of client usage, refer to *Score data* in the Python section or the R section.
Scoring is also possible directly from a bash console. TODO: bash script here