Data Science Companion Module

No Extra Tools

Use the built-in, Python-powered TURBOARD technology. You won't need to carry out data science operations in other tools and move the output around in Excel sheets, which is a burden and requires extra work to connect the statistical output with your current data. Even integration with third-party tools would only ease this cumbersome process a bit.

With TURBOARD, data science abilities are packed into the application. You don't need to know Python's powerful infrastructure tools for data analytics. You just use your mouse in a purpose-built UX to prepare the analyses.

Easy Start

A no-hassle start for data science: there is no need to prepare a data set for a separate tool. Use TURBOARD views and filters to prepare and clean data, and create samplings in the TURBOARD View interface. Then move forward to analyzing and modeling the data in the TURBOARD Analytics UX.

Connect Model to Live Data

When you come up with a model and apply it in TURBOARD, you apply that model not just to the sampling but to the whole data set. You use the same TURBOARD filters hand in hand with analytics-powered expressions, mixing analytics with your current business intelligence filters. For example, you may compare the regression line for one region with the one for the whole country, or check in which branches your predictions are more accurate.
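As a minimal sketch of this idea in plain Python (not TURBOARD's internal implementation), the snippet below fits a one-variable regression line on a sampling, applies it to every row, and then compares the prediction error per branch. The rows, the column meanings and the helper name `fit_line` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical rows: (branch, advertising spend, sales).
rows = [
    ("North", 1.0, 12.0), ("North", 2.0, 14.0), ("North", 3.0, 16.0),
    ("South", 1.0, 11.0), ("South", 2.0, 15.0), ("South", 3.0, 13.0),
]

# 1. Fit the model on a sampling (here: every second row).
sample = rows[::2]
slope, intercept = fit_line([r[1] for r in sample], [r[2] for r in sample])

# 2. Apply the model to the whole set and group the absolute error by branch.
errors = defaultdict(list)
for branch, spend, sales in rows:
    errors[branch].append(abs(sales - (slope * spend + intercept)))

mean_abs_error = {branch: sum(e) / len(e) for branch, e in errors.items()}
for branch, error in sorted(mean_abs_error.items()):
    print(f"{branch}: mean absolute error {error:.2f}")
```

Grouping the error by branch plays the role of a business intelligence filter here: the same model is evaluated on different slices of the full data.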

A General How to Use Guide

  1. Select a sampling.
  2. Leave the data science parameters at their defaults; you may play with them later on.
  3. Select a numeric target variable and source variables.
  4. A formula will be produced that predicts the target from the source variables in most cases.
  5. Check the quality of the analysis using the specific tools provided for each type of analysis.
  6. Auto-create a TURBOARD dashboard.

Machine Learning with Regression


To predict a target value from several, preferably independent, variables, regression algorithms try to find the formula that best fits the sampling.

Imagine you have data on taxi rides in a city: kilometers, minutes and fee for each ride. Regression will produce the formula that best fits the previous taxi rides. r2 (R-squared) is a quality metric for regression: it tells what share of the variation in the target is explained by the fitted formula. If r2 is 0.9, the formula found by regression explains 90% of the variation in taxi fees; the rest comes from things the formula does not capture (some people may bargain for less, or sometimes taxi drivers may keep the change).
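To make the r2 metric concrete, here is a small sketch that computes it as one minus the ratio of the residual sum of squares to the total sum of squares. The rides and the fee formula (base fare, per-kilometer and per-minute rates) are invented for illustration and stand in for whatever formula a regression would produce.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that `predicted` explains."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def predicted_fee(km, minutes):
    # Hypothetical formula: base fare 2.0, 1.5 per km, 0.35 per minute.
    return 2.0 + 1.5 * km + 0.35 * minutes

# Hypothetical rides: (kilometers, minutes, fee actually paid).
rides = [
    (2.0, 6.0, 7.0),
    (5.0, 12.0, 13.5),
    (8.0, 20.0, 21.0),
    (3.0, 10.0, 9.0),    # the passenger bargained for less
    (10.0, 25.0, 26.5),  # the driver kept the change
]

fees = [fee for _, _, fee in rides]
predictions = [predicted_fee(km, minutes) for km, minutes, _ in rides]
r2 = r_squared(fees, predictions)
print(f"r2 = {r2:.3f}")
```

An r2 close to 1 means the formula accounts for almost all of the fee variation; the two rides with a discount or a kept tip are what keep it below 1.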

Which Algorithms Can Be Used

In TURBOARD you can use linear and polynomial regression models.
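The difference between the two model types can be sketched in plain Python: a polynomial regression is a least-squares fit with extra powers of the source variable. The helper names `solve`, `polyfit` and `predict` are hypothetical, and the data is made up so that it follows y = x^2 + 1; this is an illustration, not TURBOARD's own code.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def polyfit(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] for y = c0 + c1*x + c2*x^2 + ..."""
    size = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve(a, b)

def predict(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def squared_error(coeffs, xs, ys):
    return sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))

# Made-up data that follows y = x^2 + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 5.0, 10.0, 17.0]

linear = polyfit(xs, ys, degree=1)
quadratic = polyfit(xs, ys, degree=2)
print("linear error:   ", squared_error(linear, xs, ys))
print("quadratic error:", squared_error(quadratic, xs, ys))
```

On curved data like this the degree-2 fit leaves almost no error while the straight line cannot follow the bend, which is the case where a polynomial model is worth trying.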

Possible Use Cases

  1. What will the turnover be when we change the prices of a product and its substitute/competitor products?
  2. Estimate the output of a machine given all engineering parameters to date. Evaluate outperforming and lagging machines.
  3. Suppose you have hundreds of projects with different parameters; you may estimate another parameter for future projects from the ones given. For example, you may estimate the total expenditure of a project from the number of people (with their assignments), geography and timespan, and you can even use ordinal data such as the severity of the project or the rigidity of its terms.

What to Check for Best Results?

  1. Check the r2 (R-squared) value in "Generated Expression" after each source parameter selection to make sure you have selected the parameters most related to the target variable. The higher that value, the better your prediction.

What Will Be the Default Output Visualization?

  1. It will produce scatter charts.
  2. You may want to use conditional formatting to learn more from those scatter charts.

A sample end result video (from open data)

  1. Comparison of projects by their predicted and real parameters, detect

Forecasting with Classification


Given a target variable, a classification algorithm learns how the data determines that target and then predicts an output for new parameter values.

Which algorithms can be used

SVM and Logistic Regression are available and can be used without writing a single line of code.
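For readers curious about what such a model does under the hood, the following is a minimal sketch of logistic regression trained by gradient descent in plain Python. It is an illustration under simple assumptions (one source variable, a made-up churn-style data set, hypothetical helper names `train_logistic` and `predict_lost`), not the implementation TURBOARD uses.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, learning_rate=0.1, steps=2000):
    """Fit p(lost) = sigmoid(w * (x - mean) + b) by gradient descent."""
    mean = sum(xs) / len(xs)
    centered = [x - mean for x in xs]  # centering helps gradient descent converge
    w, b = 0.0, 0.0
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(centered, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, centered)) / len(xs)
        b -= learning_rate * sum(errors) / len(xs)
    return w, b, mean

def predict_lost(model, x):
    """Predicted probability that a customer with feature value x is lost."""
    w, b, mean = model
    return sigmoid(w * (x - mean) + b)

# Hypothetical customers: months since last purchase, and 1 = lost, 0 = active.
months = [1, 2, 3, 4, 6, 8, 9, 10]
lost = [0, 0, 0, 0, 1, 1, 1, 1]

model = train_logistic(months, lost)
for m in (2, 9):
    print(f"{m} months since last purchase: p(lost) = {predict_lost(model, m):.2f}")
```

The model outputs a probability, so a threshold (commonly 0.5) turns it into a lost/active prediction.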

Possible Use Cases

  1. Churn analysis: Use your customer data and tag customers as lost or active. TURBOARD will produce a model that predicts from current data whether or not a customer will be lost, so you can take action before it's too late.
  2. Machine failure prediction: Use data from your previously failed machines to predict which machines are more prone to failure in the near future.

What to check for best results?

  1. Check the confusion matrix (a table showing predicted vs. real classes), which is provided in the playground and is one of the default visualizations created. Counts concentrated on the diagonal indicate that the model is doing its job.
  2. Check the ROC curve to see how the classification model performs across different classification thresholds.
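Both checks can be illustrated with a short sketch. The labels and model scores below are invented, and the helper names `confusion_matrix` and `roc_points` are hypothetical; the point is only to show what the diagonal of the matrix and the points of a ROC curve are made of.

```python
def confusion_matrix(actual, predicted):
    """Counts keyed by (actual_label, predicted_label) for labels 0 and 1."""
    counts = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
    for a, p in zip(actual, predicted):
        counts[(a, p)] += 1
    return counts

def roc_points(actual, scores, thresholds):
    """(false positive rate, true positive rate) at each threshold."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = []
    for t in thresholds:
        predicted = [1 if s >= t else 0 for s in scores]
        cm = confusion_matrix(actual, predicted)
        points.append((cm[(0, 1)] / negatives, cm[(1, 1)] / positives))
    return points

# Invented example: real classes and the scores a classifier gave them.
actual = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.4, 0.6, 0.35, 0.7, 0.8, 0.9]

cm = confusion_matrix(actual, [1 if s >= 0.5 else 0 for s in scores])
correct = cm[(0, 0)] + cm[(1, 1)]  # the diagonal of the matrix
print(f"correct predictions: {correct} of {len(actual)}")

curve = roc_points(actual, scores, [0.0, 0.3, 0.5, 0.75, 1.1])
for fpr, tpr in curve:
    print(f"false positive rate {fpr:.2f}, true positive rate {tpr:.2f}")
```

Each threshold gives one point of the curve; a model whose points stay near the top-left corner (high true positive rate at a low false positive rate) separates the classes well.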

What will be the default output visualization?

  1. Confusion matrices that can be used as slicers, so you can just click on the matrix to filter results.
  2. Scatter charts with colored polygonal regions, where you can check the colors and groupings by eye; see the image in the weather prediction below.
  3. Column charts showing groups of results by the different categories and measures used.

Sample end result videos (from open data)

  1. What does the weather look like today?
  2. Bankruptcy prediction

Data Mining with Clustering


Given any number of parameters, clustering creates groups of entities based on their proximity.

Which Algorithms Can Be Used

TURBOARD has K-means and Mean Shift in its repository of pre-built statistical models.
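As a rough illustration of how proximity-based grouping works, here is a minimal K-means sketch in plain Python. The points are invented, the helper names are hypothetical, and the naive initialization (the first k points become the starting centroids) is a simplification chosen to keep the example deterministic; it is not a description of TURBOARD's implementation.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20):
    centroids = [points[i] for i in range(k)]  # naive, deterministic start
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Invented customers described by two parameters, e.g. (visits, basket size).
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]

labels, centroids = kmeans(points, k=2)
print("group of each point:", labels)
print("group centers:", centroids)
```

K-means needs the number of groups up front, whereas Mean Shift derives it from the density of the data, which is one reason to have both available.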

Possible Use Cases

  1. Group your customers and prepare relevantly targeted campaigns.
  2. Group physicians by their tendency to perform C-section operations, where they are located geographically and in which types of hospitals they work. Filter by day of the week and see changes in real online data.

What to check for best results?

  1. Check the groups in the resulting charts with respect to their placement.
  2. Use the groups as filters in a normal TURBOARD dashboard to see whether or not the grouping makes sense.

What will be the default output visualization?

  1. It will create grouped polygonal scatter charts for each pair of parameters. As there is no way to represent a higher-dimensional space in a form the human mind can picture, these cross-sections serve as the only means of illustrating model results.

Some End Results

  1. Iris

Finding The Hidden Patterns


  1. Network graph (not just dashlet type but analytics tool):
  2. P-hacker - Slice and dice data to find the best correlation between selected variables. Using that tool

Which Algorithms Can Be Used

  1. Network graph uses t-SNE or PCA
  2. P-hacker is a hybrid tool

Possible Use Cases

  1. You can find which items sold in which branches are most sensitive to price changes.
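One way to picture this kind of slicing is sketched below: the data is split by branch and item, the Pearson correlation between price and units sold is computed for each slice, and the slice with the strongest negative correlation is reported. This is an assumption about what such a search could look like, with invented rows and hypothetical helper names, not a description of how the P-hacker tool works internally.

```python
from collections import defaultdict
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented rows: (branch, item, price, units sold).
rows = [
    ("Downtown", "Milk", 1.0, 100), ("Downtown", "Milk", 1.2, 80), ("Downtown", "Milk", 1.4, 60),
    ("Downtown", "Bread", 2.0, 50), ("Downtown", "Bread", 2.2, 52), ("Downtown", "Bread", 2.4, 47),
    ("Airport", "Milk", 1.0, 40), ("Airport", "Milk", 1.2, 41), ("Airport", "Milk", 1.4, 38),
]

# Slice the data by (branch, item).
slices = defaultdict(lambda: ([], []))
for branch, item, price, units in rows:
    slices[(branch, item)][0].append(price)
    slices[(branch, item)][1].append(units)

correlations = {key: pearson(prices, units) for key, (prices, units) in slices.items()}
most_sensitive = min(correlations, key=correlations.get)
print("strongest negative price/volume correlation:", most_sensitive)
```

Correlation measures how consistently volume moves with price, not how large the movement is, so a slice found this way is a candidate for a closer regression analysis rather than a final answer.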

What to check for best results?

  1. On the network graph, the t-SNE algorithm will neatly place similar data points near each other
  2. You can use TURBOARD to further add groups and detect anomalies

What will be the default output visualization?

  1. Network graph
  2. 2D representation on scatter
  3. 3D chart

Some end results

  1. Group FIFA 19 Football players
  2. Pharmaceutical companies grouped by their financial stability

Artificial Intelligence Powered What-If


Unlike what-if reports in competitor BI tools, TURBOARD can use a statistical model within the what-if, so a 20% increase in price will not boost turnover by 20%, because the model predicts there will be fewer sales.
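The arithmetic behind this can be sketched in a few lines. The demand line below is hypothetical and stands in for a fitted regression model; the numbers are invented to show the contrast between a what-if that holds volume fixed and one that lets a model predict the new volume.

```python
def predicted_units(price):
    # Hypothetical demand line, standing in for a fitted regression model.
    return 200.0 - 10.0 * price

def turnover(price, units):
    return price * units

base_price = 10.0
new_price = base_price * 1.20  # simulate a 20% price increase
base_units = predicted_units(base_price)

baseline = turnover(base_price, base_units)
naive = turnover(new_price, base_units)  # assumes volume stays unchanged
model_based = turnover(new_price, predicted_units(new_price))

print(f"baseline turnover:    {baseline:.0f}")
print(f"naive what-if:        {naive:.0f}")
print(f"model-based what-if:  {model_based:.0f}")
```

With these invented numbers the naive what-if reports a 20% gain, while the model-based one shows turnover falling slightly because the predicted volume drops with the higher price.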

Which algorithms can be used

  1. Linear or polynomial regression can be used.
  2. You can use parameters hand-in-hand with analytics powered expressions

Possible use cases

  1. Take into consideration substitute and complementary products and competitor market prices, and predict sales volume under simulated price changes. Maximize your profit with the best pricing. Check out the video in the Pricing tab of the Grocery Store Industry page.
  2. Set your new year's budget by playing with functional, economic and administrative budget categories. Target next year's total budget and make compromises before, not after, the budget is set.

What to check for best results?

  1. Use the patent-protected cloning functionality of TURBOARD to see what-if changes and real data next to each other in columns.

What will be the default output visualization?

  1. What-if can be applied to any dashlet, and there are no auto-generated dashlets for this analytics type.

Some end results

  1. What if the companies that are predicted to go bankrupt do so; how many people will be unemployed next year?
  2. What if you change the price of organic avocado; what will the sales volumes of organic and conventional avocado be?

To reveal striking insights buried in your own data discover