Data Driven
Quantitative Approach to the Business
C&I's team has expertise and experience in index construction, macro/micro forecasting and nowcasting, design of quantitative scorecards and credit rating systems, behavioural market segmentation, process optimisation, CRM, stress testing, pharmacoeconomics, health assessment, football performance, agricultural optimisation and digital media (e.g. TV ratings systems, fan analytics and audience estimation).

All our quantitative products are data driven: our methodologies, techniques and approach reveal hidden patterns in large datasets and produce algorithms that extract useful information. In this way we move from the concept of data value to the concept of information value.

Football Quantitative Solutions
Data driven football.
Football player transfer value
Quantifying the impact of individual player characteristics on the target estimation value.
Fair Transfer Value
Knowing the Value
Our idea is to provide a formula that impartially evaluates a football player's value, i.e. a fair transfer value at any time, regardless of transfer windows. The fair transfer value estimate is based on the player's performance.

The models aim to quantify the impact of individual player characteristics on the target estimation value.

The methodology is based on the idea that a player's characteristics are what attract buyers. To estimate a player's value it is therefore necessary to determine the prices of these individual characteristics, i.e. their implicit prices, and also how their particular combination affects the valuation.

Player & Team Indicators
Quantify the impact of individual player characteristics on the target estimation value.
Quantifying the performance
Performance Indicators
The idea is to observe a player not as an individual with a name and surname, but as a set of performance indicator values. We build the performance indicators using a combination of several quantitative methods.

We can incorporate the expert knowledge of football professionals and quantify the performance measures that experts find useful. Player indices can be extended to “team indices”, which opens the path to measures of team cohesion. Our further research on transfer value and player indices led to a method for calculating team winning potential that takes into account the particular formations and line-ups of both teams. Our methodology is valuable know-how that can produce customized indices for football decision support systems.

Player Injury risk
Quantitative injury risk assessment and management.
Minimizing the risks
Injury Risk Management
The main idea is to provide experts with measures of injury risk so that they can manage it and keep it at an acceptable level. Our techniques and methodologies can trace correlations between play and a player's injury risk.

There are three main questions that we want to answer:

What is the player's risk of a specific injury?

How long will recovery of the injured player last?

How long will it take for the player to reach pre-injury performance level?

All calculations are completely data driven. We use a combination of methodologies such as regression analysis, neural networks and clustering.

Team Win Potential
Quantify the impact of individual player characteristics on the target estimation value.
It's about the team, not the names
Creating the optimal team
The idea is to observe a player not as an individual with a name and surname, but as a set of performance indicator values. We build the performance indicators using a combination of several quantitative methods.

We can incorporate the expert knowledge of football professionals and quantify the performance measures that experts find useful.

To estimate match outcome probabilities, we apply a specific set of quantitative models to historical performance data. The model output is the probability of "Home win", "Draw/Tie" and "Away win". Our team winning potential takes into account the particular formations and line-ups of both teams. Because a team is shaped by the individual characteristics of its players, this opens the path to measures of team cohesion. The results can be used in scouting: they point to the characteristics the team would benefit from most, so that players with the desired set of characteristics can be searched for.
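As a rough illustration of how line-up-level indicators could be turned into the three outcome probabilities, the sketch below uses a symmetric multinomial-logit (softmax) form. This is an assumption made for illustration, not C&I's actual model: the functions `team_strength` and `outcome_probabilities`, the parameters `home_advantage` and `sensitivity`, and the player index values are all hypothetical, and in practice the parameters would be estimated from historical match data.

```python
import math

def team_strength(lineup):
    """Average of the players' performance indices in the line-up (hypothetical measure)."""
    return sum(lineup) / len(lineup)

def outcome_probabilities(home_lineup, away_lineup, home_advantage=0.0, sensitivity=1.0):
    """Softmax over three outcomes; the draw is the baseline with score 0."""
    diff = sensitivity * (team_strength(home_lineup) - team_strength(away_lineup))
    diff += home_advantage
    scores = {"Home win": diff, "Draw/Tie": 0.0, "Away win": -diff}
    total = sum(math.exp(s) for s in scores.values())
    return {outcome: math.exp(s) / total for outcome, s in scores.items()}

# Made-up player indices for two three-player "line-ups" (illustration only).
strong_home = [0.9, 0.8, 0.7]
weak_away = [0.4, 0.5, 0.3]
probs = outcome_probabilities(strong_home, weak_away, home_advantage=0.2, sensitivity=2.0)
for outcome, p in probs.items():
    print(f"{outcome}: {p:.3f}")
```

Swapping one player's index in a line-up and re-running the function shows how a single change in the line-up shifts the three probabilities, which is the kind of question a scouting use case would ask.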

Audience estimation and TV rating
Powerful media tools.
Audience estimation and TV ratings
For our respected media customers, C&I's team developed a system for estimating TV ratings in non-metered markets. The system was focused on football and motorsports TV content.
Budget impact studies and health assessment.
Pharmacoeconomic study
Our team worked on the calculations supporting pharmacoeconomic studies. The work included budget and health impact simulations for the introduction of a new drug on the market.
Vineyard production optimisation.
Vineyard production optimisation
Agrimatica is a quantitative decision support system (DSS) for the wine-making industry.

The idea is to optimise vineyard processes and winery grape processing simultaneously, in order to gain more control over logistics problems in both parts of wine production. The goal is to enable quantitatively grounded decision support. The need for such a system arises from the complexity of logistics in the wine-making industry: the timeframe for harvest and grape processing is short if product quality is to be controlled. The system is intended for large, technologically developed production systems that are integrated with vineyard management. Small producers are also of special interest. They could use certain features of our DSS, improving our overall concept for the wine-making sector. As experts in mathematical modelling, we have developed a COS (core optimisation system) prototype containing quantitative models for process optimisation in wineries (including all technological steps and equipment). Further development of the COS will enable grape harvest scheduling (with quantitative prediction of ripening and quality) and a system for the assessment and control of production results.

System for car plates recognition
Computer vision.
Computer vision
System for car plates recognition
C&I has experience in computer vision and can offer computer vision solutions. Our system, built on advanced machine learning methods, successfully recognizes car plates in video of moving vehicles.
Real Estate Market Analysis
Complete quantitative solutions for real estate market analysis.
Real Estate Market Analysis

Constructing an index that tracks the dynamics of the real estate market on a regular basis is a complex task. The common statistical approach to monitoring real estate prices uses indexes based on average prices per square meter of living space. We offer a different methodology. To monitor the dynamics of real estate prices successfully, bias in the data must be eliminated. A reliable real estate price index should reflect the change in "pure" real estate prices, regardless of how many extremely high-quality or low-quality units were sold in a given period. The technical framework that enables the construction of a reliable index, one that takes the characteristics of the real estate into account, is based on multivariate hedonic regressions [1]. The methodology rests on the idea that the characteristics of a property are the features that attract buyers; therefore, to value a property it is necessary to determine the prices of these characteristics (attributes), i.e. their implicit prices.

The outlined methodology is used to create an expert system that estimates the market price of a given property from its characteristics. Each characteristic has a determined value (its individual implicit price), and these values are combined to produce the price of the property. The system can estimate the price precisely even if the available database does not contain a property with the same characteristics. An end user can use the system via a simple spreadsheet (e.g. Excel) or a similarly simple application. It also reduces the number of personnel needed for real estate valuation. Its simplicity of use and precision make this product a valuable tool for any analyst.
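The idea of implicit prices can be sketched with a small log-linear hedonic regression. The example below is a minimal illustration under stated assumptions, not the production system: the three characteristics (area in square meters, city-centre flag, balcony flag), the coefficients in `TRUE_COEFS` and all prices are synthetic, and the helper names `solve`, `fit_hedonic` and `estimate_price` are hypothetical. The prices are generated from known coefficients so that the fit can be checked.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_hedonic(features, prices):
    """OLS of log(price) on [1, features]; returns the intercept followed by implicit prices."""
    rows = [[1.0] + list(f) for f in features]
    y = [math.log(p) for p in prices]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def estimate_price(coefs, feature_row):
    """Price implied by the fitted implicit prices for one property."""
    return math.exp(coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature_row)))

# Synthetic data: (area_m2, in_city_centre, has_balcony); assumed effects on log price.
TRUE_COEFS = [10.0, 0.01, 0.20, 0.05]
features = [(50, 0, 0), (70, 1, 0), (90, 0, 1), (60, 1, 1), (120, 1, 0), (45, 0, 1)]
prices = [estimate_price(TRUE_COEFS, f) for f in features]

coefs = fit_hedonic(features, prices)
unseen = (80, 1, 1)  # a combination of characteristics not present in the data
print("implicit prices (log scale):", [round(c, 4) for c in coefs[1:]])
print("estimated price of unseen property:", round(estimate_price(coefs, unseen)))
```

Because the price is assembled from the implicit prices of the characteristics, the model can value a property whose exact combination of characteristics never appears in the database.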

Knowing customer preferences is important. By comparing the implicit prices of individual characteristics, the structure of customer preferences can be examined in detail. We can, for example, answer questions such as: "How important is it to the buyer that the apartment is located in a particular part of town?" or "How much do buyers care about the type of housing (apartment building, apartment house, etc.)?"

Our methodology may be used to estimate the equilibrium level of property prices in the market. At the micro level, it is possible to calculate whether the price of a property is above or below the expected price. At the macro level (i.e. at the level of the total market), it is possible to assess on a regular basis whether the market is "overheated", i.e. whether the price level is above that determined by the economic fundamentals (GDP, general level of inflation, etc.).

If the available database contains both asking prices and actual sale prices, it is possible to determine an unbiased index of the bid-ask spread for the real estate market. This index is often a useful indicator of changes in real estate prices in the near future.

Powerful strategic tools.
Reliable forecasts of key variables are nowadays an integral part of most decision-making processes in business. The reason is that the quality of a firm's demand, sales or cash-flow forecast, for example, may have a significant impact on future pricing decisions, investment planning or future capacity requirements. Moreover, in many businesses some important decisions materialize only with a considerable time lag, and are therefore grounded mostly on the available forecasts.

Traditionally, in many firms forecasts are compiled internally using various ad hoc methods. In other words, they are based on the forecaster's judgment of the underlying developments of interest. This approach makes sense only in a few cases: for example, when the forecaster is unbiased, understands and ideally is able to quantify the relations between key variables very well, or has privileged information about the variables to be forecast. Otherwise, judgmental methods are no better at forecasting than "crystal ball" guessing.

In many cases, more reliable forecasts are produced by formal methods based on econometric, economic and statistical theory. Instead of guessing future developments purely on the basis of an individual's judgment, these methods summarize the historical information captured by many (even hundreds or thousands of) variables and exploit it to make predictions about the target variable. Moreover, these methods are suitable for making inferences about, and quantifying, the relations between the variables of interest.

The methodology needed to make such inferences includes models such as:

  1. Linear regression models, SARIMA models
  2. Exponential smoothing models, systems of equations
  3. Standard vector autoregressions (VAR), Bayesian VARs, various models capturing seasonality
  4. Monte Carlo simulation based models, Markov Switching methods
  5. Dynamic Factor models.

With many of these models it is possible to incorporate "judgmentally" constructed forecasts into those produced by models in order to achieve more reliable results.
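One common way to blend a judgmental forecast with a model forecast is a weighted average whose weights are inversely proportional to each source's past mean squared error. The sketch below illustrates that idea only; it is not necessarily the combination scheme used in practice, and the function names and all numbers are hypothetical.

```python
def mean_squared_error(errors):
    """Average of squared past forecast errors."""
    return sum(e * e for e in errors) / len(errors)

def inverse_mse_weight(model_errors, judgment_errors):
    """Weight of the model forecast; the judgmental forecast gets 1 minus this weight."""
    inv_model = 1.0 / mean_squared_error(model_errors)
    inv_judgment = 1.0 / mean_squared_error(judgment_errors)
    return inv_model / (inv_model + inv_judgment)

def combine(model_forecast, judgment_forecast, w_model):
    """Weighted average of the two forecasts."""
    return w_model * model_forecast + (1.0 - w_model) * judgment_forecast

# Illustrative past errors (actual minus forecast) for both sources.
model_errors = [1.0, -1.0, 2.0, -2.0]      # MSE = 2.5
judgment_errors = [3.0, -3.0, 3.0, -3.0]   # MSE = 9.0

w_model = inverse_mse_weight(model_errors, judgment_errors)
combined = combine(100.0, 110.0, w_model)
print(f"model weight: {w_model:.3f}, combined forecast: {combined:.2f}")
```

The source with the smaller historical error gets the larger weight, so the combined forecast leans towards whichever of the two has been more accurate.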

Finally, since every forecast is made with some error, one must develop a framework in which that error is clearly presented and well understood. For that purpose the forecast can be presented in a so-called "fan chart" (Figure 1). Besides the forecast itself, this type of graph also shows the probabilities of different outcomes being realized in the future.
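The bands of a fan chart can be obtained by simulating many possible future paths and reading off percentiles at each horizon. The sketch below assumes, purely for illustration, a random walk with drift and normally distributed shocks; the functions `simulate_paths` and `fan_chart_bands` and all parameter values are hypothetical, and any of the models listed above could supply the simulated paths instead.

```python
import random
import statistics

def simulate_paths(last_value, drift, sigma, horizon, n_paths, seed=0):
    """Monte Carlo paths of a random walk with drift (an assumed data-generating process)."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        value, path = last_value, []
        for _ in range(horizon):
            value += drift + rng.gauss(0.0, sigma)
            path.append(value)
        paths.append(path)
    return paths

def fan_chart_bands(paths, levels=(5, 25, 50, 75, 95)):
    """For each forecast step, the requested percentiles across all simulated paths."""
    bands = []
    for step_values in zip(*paths):
        cuts = statistics.quantiles(step_values, n=100)  # 99 cut points
        bands.append({p: cuts[p - 1] for p in levels})
    return bands

paths = simulate_paths(last_value=100.0, drift=0.5, sigma=2.0, horizon=8, n_paths=2000, seed=42)
bands = fan_chart_bands(paths)
for step, band in enumerate(bands, start=1):
    print(step, {p: round(v, 1) for p, v in band.items()})
```

Plotting the percentile pairs (5 to 95, 25 to 75) as shaded areas around the median gives the familiar fan shape, which widens as the forecast horizon grows.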

Optimization of Business Processes
The way to maximise efficiency.
Quantitative scoring
Automatic classification of business subjects according to their risk profile.
Quantitative scorings
Quantitative scoring is a methodology for the automatic classification of business subjects according to their risk profile. The methodology is widely applicable. Characteristics of the quantitative scoring methodology:
  • Precise, fast and cost efficient analysis;
  • Highest efficiency in using complete information held in database in use;
  • Planning of desirable risk exposure;
Quantitative Scoring advantages compared to other methods:
  • Impartial approach to variables, risk factors and business subjects;
  • Small number of analysts needed;
  • Gives a clear numerical measure of risk;
The goal of quantitative scoring is to give the user an exact measure of the risk level of the observed business subject, so that the user can adjust risk exposure without discarding desirable business subjects (clients). Scorecards are built through quantitative mathematical analysis (based on regression modelling) of the vast number of variables in the observed database. Here one advantage of scorecards over classical expert (qualitative) analysis emerges: a scorecard model can observe the correlations between all variables in the database, and does so quickly, whereas an analyst mostly focuses on only a few variables.

The mathematics behind a scorecard is complex, but the resulting scorecard is simple. It extracts a small number of variables while preserving all the correlations that existed in the database and all the information the database holds. The chosen variables, with their associated scores, describe the client's risk profile precisely. The scorecard assigns a cumulative score to the client (or product) and produces a single, simple number that measures the probability of default. The default case is the target event that the analysis seeks for a client or product, e.g. the probability that a client will not pay loan annuities that fall due in the following period.

In addition, a rating system can be established from the probability of default. Clients are ranked by their probability of default, which makes it possible to choose a threshold that determines which clients are accepted. The threshold depends on risk appetite. The end user of the scorecard has a clear measure of the client's risk level and can see whether the client is acceptable under the company's predetermined risk exposure threshold. The precision of the scorecard analysis can be checked through back-testing.
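The step from a cumulative score to a probability of default and an accept/reject decision can be sketched as follows. The example assumes a logistic-regression-style scorecard, which is one common form of regression-based scoring; the variables, bands, point values in `POINTS`, the `INTERCEPT` and the 10% threshold are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical scorecard: points are log-odds contributions per variable band.
INTERCEPT = -2.0
POINTS = {
    "debt_to_income": {"low": -0.8, "medium": 0.0, "high": 0.9},
    "months_in_arrears": {"0": -0.5, "1-2": 0.6, "3+": 1.5},
    "years_as_client": {"<1": 0.4, "1-5": 0.0, ">5": -0.6},
}

def probability_of_default(client):
    """Sum the points of the client's bands and map the total to a probability."""
    log_odds = INTERCEPT + sum(POINTS[var][band] for var, band in client.items())
    return 1.0 / (1.0 + math.exp(-log_odds))

def is_acceptable(client, max_pd=0.10):
    """Accept the client if the probability of default does not exceed the risk threshold."""
    return probability_of_default(client) <= max_pd

good = {"debt_to_income": "low", "months_in_arrears": "0", "years_as_client": ">5"}
risky = {"debt_to_income": "high", "months_in_arrears": "3+", "years_as_client": "<1"}

for name, client in (("good", good), ("risky", risky)):
    pd_value = probability_of_default(client)
    print(f"{name}: PD = {pd_value:.3f}, acceptable = {is_acceptable(client)}")
```

Raising or lowering `max_pd` is the code-level counterpart of changing the risk appetite: the scorecard stays the same and only the acceptance threshold moves.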
Churn Modelling
Powerful strategic tools.
Market (Behavioural) Segmentation
Find new ways to increase customer satisfaction.
Market (Behavioural) Segmentation
Market segmentation offers a novel way to review market characteristics, conditions and dynamics. It is often described as an "inhomogeneous" approach. Market segmentation makes it easier to find new ways to increase customer satisfaction and loyalty, or to improve customer service models.

Behavioural Segmentation

Behavioural segmentation is a sophisticated way of compressing information about customer behaviour, which makes it a valuable strategic decision support tool. Essentially, it is a list of the relevant types of customer behaviour manifested over some period. Behavioural segmentation stands out from other market segmentations in the following ways:
  • It is the only one built by consistent observation of the whole range of aspects that define customer's interaction with the company;
  • It successfully replaces a whole range of other partial decision support solutions such as "Propensity to Buy" (PtB) models, Customer Attrition models or Activation models, thus significantly reducing the costs of development, implementation and maintenance of these models;
  • Behavioural segmentation, unlike LTV* segmentation, doesn't rely on any assumptions regarding future "Business to Consumer" (B2C) interaction, so it is far more robust and reliable;
Behavioural segmentation groups customers into segments according to similarities in their interaction with the company. In an example taken from retail banking, interaction could be observed through the following aspects:
  • Client's level, structure or change of: debt, savings, consumption, income, activity, ...;
  • Risk assessment;
  • Demography (age, education, profession...);

Interaction is measured over a fixed historical observation period, whose length is determined by the nature of the company's business as well as by the intended use of the segmentation. Segments are usually named after a dominant behavioural feature, which is subsequently connected ("coloured") with other features to obtain the full picture. In the retail banking example above, the segments could be High Consumers, Highly Indebted Customers, High LOC Utilization Customers, Debt Expiration Customers, etc.

Once customers are grouped into segments (according to similarities in their interaction with the company), one can assume that a segment's future behaviour will resemble its behaviour in the recent past, i.e. a given group (segment) is expected to show behavioural dynamics similar to those it has shown recently.

Developing a behavioural segmentation is a very complex process, but the final product is very simple to use. As for data requirements, the data need not go far into the past, but the information stored in the database should allow a wide range of aspects of customer behaviour to be derived. The basic data mining technique relied on is cluster analysis, supported by principal component analysis (PCA) and correlation analysis.
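To give a feel for the cluster analysis step, the sketch below runs a plain k-means on a handful of made-up customers. It is only an illustration: k-means is one of several possible clustering techniques and is an assumption here, the two features (a debt ratio and a scaled card spend, both between 0 and 1) and all values are hypothetical, and the PCA and correlation analysis mentioned above are omitted.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on a list of equal-length numeric tuples."""
    # Deterministic farthest-point initialisation: start from the first point,
    # then repeatedly add the point farthest from the centres chosen so far.
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(math.dist(p, c) for c in centres)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centre.
        labels = [min(range(k), key=lambda j: math.dist(p, centres[j])) for p in points]
        # Move each centre to the mean of its members (keep it if the cluster is empty).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centres

# Made-up customers: (debt_ratio, scaled_card_spend).
customers = [
    (0.90, 0.20), (0.85, 0.25), (0.80, 0.10),   # heavily indebted, low spend
    (0.10, 0.90), (0.15, 0.80), (0.20, 0.95),   # low debt, high spend
    (0.10, 0.10), (0.05, 0.20), (0.15, 0.15),   # low debt, low spend
]
labels, centres = kmeans(customers, k=3)
print(labels)
print([tuple(round(v, 2) for v in c) for c in centres])
```

In a real project the features would first be standardised, and the resulting clusters would then be inspected and named after their dominant behavioural feature.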

The technical definition of the segments, in terms of their defining aspects, is contained in a very short and simple algorithm. So, unlike the development process, the technical implementation of behavioural segmentation is an easy task. The entire database can be segmented at any time by applying simple segmentation rules.
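Once the segments are defined, applying them can be as simple as a few ordered rules. The sketch below uses segment names from the retail banking example; the field names, the thresholds and the order of the rules are hypothetical and serve only to show how short such an algorithm can be.

```python
def assign_segment(customer):
    """Return the first segment whose (hypothetical) rule the customer satisfies."""
    if customer["loc_utilisation"] >= 0.8:
        return "High LOC Utilization Customers"
    if customer["debt_to_income"] >= 0.5:
        return "Highly Indebted Customers"
    if customer["monthly_spend"] >= 2000:
        return "High Consumers"
    return "Other"

# A tiny made-up "database" of customers.
database = [
    {"id": 1, "loc_utilisation": 0.95, "debt_to_income": 0.3, "monthly_spend": 800},
    {"id": 2, "loc_utilisation": 0.10, "debt_to_income": 0.7, "monthly_spend": 500},
    {"id": 3, "loc_utilisation": 0.20, "debt_to_income": 0.1, "monthly_spend": 3500},
    {"id": 4, "loc_utilisation": 0.05, "debt_to_income": 0.2, "monthly_spend": 300},
]
segments = {c["id"]: assign_segment(c) for c in database}
for customer_id, segment in segments.items():
    print(customer_id, segment)
```

Re-running the same function over a refreshed database re-segments every customer without any new modelling work.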

Developing a behavioural segmentation is more time-consuming than developing an individual (partial) solution (e.g. various PtB models, attrition models, etc.). However, a well-developed behavioural segmentation can not only effectively replace all of those tools together, but also gives a completely new and improved picture of the client portfolio.

Besides behavioural properties, behavioural segmentation also allows non-behavioural properties to be assigned to the recognized patterns of behaviour. This is contrary to the principle of intuition-based "simple segmentation". Furthermore, in the intuitive approach, correlations and multicollinearity among the attributes describing clients are not tested or properly treated, leading to a false image of the client portfolio.

Through a comprehensive and unbiased "data driven" approach, behavioural segmentation will discover hidden interrelations between client groups.

* LTV segmentation is based on an assessment of the total profit that the customer will generate during the relationship with the company. It is therefore necessary to assess: (a) attrition time (possibly even death), (b) the development over time of the customer's basket of products and/or services, (c) the development over time of the profitability of that basket. In other words, LTV is based on a prediction of the cash flow generated by each client, not over a selected fixed future period, but up to the (again predicted) end of the relationship between the customer and the firm. Customers are then grouped into segments according to the forecasted profitability. However, such a large number of complex forecasts of various aspects of customer interaction with the company cannot work reliably, especially not in combination.