Functionalities

Versioning

Every model and every script deployed on scoringduck.com has a version. A version is a non-empty string that may contain only digits, lowercase ASCII letters, underscores, hyphens and dots. If a version is not given explicitly at deployment time, one is generated. Exactly one version of a given model or script is the default at any time. Querying without specifying a version has the same effect as specifying the default one.
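The character rule for version strings can be checked locally before deploying. The following is a minimal sketch, not the check performed by scoringduck.com itself; the names VERSION_PATTERN and is_valid_version are hypothetical and chosen only for illustration.

```python
import re

# Allowed characters per the rule above: digits, lowercase ASCII letters,
# underscores, hyphens and dots. The "+" enforces a non-empty string.
# VERSION_PATTERN and is_valid_version are hypothetical names.
VERSION_PATTERN = re.compile(r"[0-9a-z_.-]+")


def is_valid_version(version):
    """Return True if `version` satisfies the documented character rule."""
    return isinstance(version, str) and VERSION_PATTERN.fullmatch(version) is not None
```

Under this sketch, strings such as "1.0.2" or "v2_beta-1" pass, while an empty string, "V1" (uppercase letter) and "1 0" (space) are rejected.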

Model

Models are R or Python scikit-learn models that can be queried, i.e. asked to make predictions for given data. For examples of building models, see the Tutorial.

Script

Scripts are written in Python 3.7 and are useful when datasources need to be connected with models, or when data must be altered beyond the capability of a single model or datasource. Examples include using datasource output as model input and filtering datasource output with a regular expression. Certain external Python modules, such as numpy, pandas and scipy, are provided. Functions for interacting with models and other entities are also provided.
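As an illustration of the regular-expression use case, the sketch below filters a list of records by matching one field against a pattern. The platform functions for calling datasources and models are not named in this section, so the datasource output is stood in for by a plain list of dicts; the function filter_rows and the sample records are hypothetical.

```python
import re


def filter_rows(rows, field, pattern):
    """Keep only the rows whose `field` value matches `pattern` somewhere.

    `rows` stands in for datasource output (a list of dicts); this is an
    assumption made for illustration, not a platform API.
    """
    regex = re.compile(pattern)
    return [row for row in rows if regex.search(str(row.get(field, "")))]


# Hypothetical sample data standing in for a datasource response.
sample_rows = [
    {"email": "ann@example.com"},
    {"email": "bob@example.org"},
    {"email": "carol@example.com"},
]

# Keep only addresses ending in ".com".
com_rows = filter_rows(sample_rows, "email", r"\.com$")
```

Rows that lack the field are treated as having an empty value and are dropped unless the pattern matches an empty string.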

Datasource

Datasources are used for retrieving and caching data.

External datasources

When a datasource is queried, it requests data from the server unless a non-expired answer for an identical query is present in the cache. Once a response is received, it is stored in the cache for future use. When a response is considered expired depends on the datasource type. Most datasource types accept a cache_days argument in their output, which dictates how the expiry of a cached record is determined. When cache_days is negative, everything in the cache is considered non-expired. When cache_days is non-negative, all cached records no older than that number of days are considered non-expired. For example, with cache_days set to 1, only records from yesterday and today are non-expired. The decision depends solely on the date (hour, minute, second and so on are irrelevant), using scoringduck server time.
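The cache_days rule can be expressed as a small date comparison. This is a minimal sketch of the rule as described, not the service's actual implementation; the function name is_non_expired and its parameters are hypothetical, and `today` is assumed to be the current date according to scoringduck server time.

```python
from datetime import date


def is_non_expired(record_date, today, cache_days):
    """Apply the cache_days rule using calendar dates only.

    record_date: date on which the record was cached
    today:       current date (assumed to be server time)
    cache_days:  negative means cached records never expire
    """
    if cache_days < 0:
        return True
    # Only whole days count; time of day is deliberately ignored.
    return (today - record_date).days <= cache_days
```

With cache_days set to 1, a record cached today or yesterday is non-expired and a record cached two days ago is expired. With cache_days set to 0, only records cached today are non-expired.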

SQL_QUERY type datasources

A SQL_QUERY-type datasource queries the SQL server specified in UserDatabase. The SQL statement in a SQL_QUERY-type datasource may be either a plain string or a jinja2 template. In the latter case, the provided input is used to render the template before the statement is executed. The table produced by the statement is returned as a list of dicts, where the keys of each dict are the column names. If onlyfirstrow is enabled for a given datasource, a single dict is returned instead. SQL_QUERY-type datasources do not cache.
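The shape of the returned data can be illustrated with Python's built-in sqlite3 module. This sketch only mimics the described output format against an in-memory database; it does not use UserDatabase or jinja2 rendering, and the function run_query, its only_first_row parameter and the clients table are hypothetical. Returning None for an empty result with only_first_row enabled is an assumption, since the behaviour for that case is not stated here.

```python
import sqlite3


def run_query(connection, statement, params=(), only_first_row=False):
    """Run a SELECT and return rows as dicts keyed by column name."""
    cursor = connection.execute(statement, params)
    # cursor.description holds one tuple per column; item 0 is the name.
    columns = [col[0] for col in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    if only_first_row:
        # Assumption: None when the result is empty.
        return rows[0] if rows else None
    return rows


# Hypothetical in-memory table used only for demonstration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE clients (id INTEGER, name TEXT)")
connection.executemany(
    "INSERT INTO clients VALUES (?, ?)", [(1, "Ann"), (2, "Bob")]
)

query = "SELECT id, name FROM clients ORDER BY id"
all_rows = run_query(connection, query)
first_row = run_query(connection, query, only_first_row=True)
```

Here all_rows is a list with one dict per row, and first_row is the single dict for the first row.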

Query GUI

[Figure: screenshot of the query screen (../_images/queryscreen.png)]

The query interface is common to models, scripts and datasources. The screenshot shows a successful query of the ipqscore external datasource: {"ip":"8.8.8.8"} was used as input data, and information about the given IP address was retrieved. JSON is used for both input and output. Output may also be shown in table form, though not all entities support it.