Data Science Through a Causal Lens

AI That Knows Why

Understanding causation instead of mere correlation is one of the hot topics within the AI community. However, very few people understand how to merge causality with traditional data science, and most research focuses on using causality to better understand historical data rather than on building AI solutions that are driven by a causal approach. This is one of the key challenges that Geminos is working on.

When building an AI solution using causality, we’re going to follow five key steps: 

  1. Build a causal model of the business as we want it to be. 
  2. Analyze our historical data to see how much of the causal model can be expressed without adding more data sources. If something is missing, we need to decide whether to try to find the needed data or to reduce the scope of the solution. 
  3. Estimate or measure the probability of each causal path in the model and build functions that fulfill each of the cause-and-effect relationships. Some of these functions will be very simple and some will involve sophisticated ML or DL algorithms.  
  4. Build an AI-driven solution that embeds the above functions.  
  5. Monitor the solution as it runs in our newly digitized business and feed improvements back into the model. 
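
To make the first two steps more concrete, here is a minimal Python sketch. It is an illustration only, not the Geminos platform or its API: the variable names (`price`, `marketing_spend`, `demand`, `revenue`), the helper functions and the list of historical columns are all hypothetical. The causal model is stored as a simple mapping from each effect to its direct causes, and we then check which variables the historical data cannot yet express.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Step 1 (hypothetical example): each variable maps to the set of its direct causes.
causal_model = {
    "price": set(),
    "marketing_spend": set(),
    "demand": {"price", "marketing_spend"},
    "revenue": {"price", "demand"},
}


def missing_variables(model, available_columns):
    """Step 2: list model variables with no matching column in the historical data."""
    variables = set(model) | {cause for causes in model.values() for cause in causes}
    return sorted(variables - set(available_columns))


def evaluation_order(model):
    """Order the variables so that every cause comes before its effects."""
    return list(TopologicalSorter(model).static_order())


# Assumed column names of an existing historical data set.
historical_columns = ["price", "demand", "revenue"]

gaps = missing_variables(causal_model, historical_columns)
order = evaluation_order(causal_model)
print("Variables without data:", gaps)
print("Evaluation order:", order)
```

In this example the check reports `marketing_spend` as missing, which is exactly the decision point described in step 2: either find that data or reduce the scope of the model.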

A more detailed White Paper on the above process is available on request. 

The end result is a model of our digitalized business that is much easier to understand and implement, and also much more transparent (“AI That Knows Why”). In addition, parts of the model can be easily separated out and reused in other use cases that rely on the same cause-and-effect pairs of events. 

By driving data science from the causal model and taking this ‘divide and conquer’ approach, we move away from the monolithic algorithms traditionally used in AI: simple causal functions are handled in standard code, leaving a set of smaller, simpler AI algorithms to fill out the remaining causal relationships. 
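
As a rough illustration of this ‘divide and conquer’ idea, the Python sketch below gives each effect its own small function. Everything in it is an assumption made for the example: the variables, the four made-up historical observations, and the use of a least-squares line as a stand-in for a more sophisticated ML algorithm. Note also that a function fitted to observational data only describes a cause-and-effect relationship if the causal model has already accounted for confounding.

```python
# Hypothetical historical observations of (price, demand).
history = [(10.0, 100.0), (12.0, 90.0), (14.0, 80.0), (16.0, 70.0)]


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(history)


def demand_from_price(price):
    """Learned relationship: stands in for an ML model fitted to history."""
    return slope * price + intercept


def revenue_from(price, demand):
    """Simple relationship handled in standard code, no learning needed."""
    return price * demand


# One (causes, function) pair per effect, listed with causes before effects.
causal_functions = [
    ("demand", ("price",), demand_from_price),
    ("revenue", ("price", "demand"), revenue_from),
]


def run_model(inputs):
    """Evaluate each effect from the current values of its direct causes."""
    values = dict(inputs)
    for effect, causes, function in causal_functions:
        values[effect] = function(*(values[cause] for cause in causes))
    return values


result = run_model({"price": 11.0})
print(result)
```

Because each relationship lives in its own function, the `demand` function could later be swapped for a richer model, or reused in another use case, without touching the `revenue` calculation.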

Summary

Different communities have approached Judea Pearl’s work on causality from different angles, such as gaining a better understanding of data (e.g., in epidemiology), risk management, systems control and Artificial General Intelligence.  

At Geminos, we are 100% focused on bringing the benefits of Pearl’s work to the development of everyday AI-driven business solutions. To make that possible, our platform merges causality with traditional data science and includes low/no-code capabilities to ease the creation of finished solutions. 

The following screenshot shows a causal model in our toolset. Our tools are built into Node-RED, which is a widely used low-code environment. This allows us to embed the causal and knowledge models directly into the application architecture. The ‘UI Dashboard’ in the green box is optional and is shown here to illustrate the direct connection between the causal model and the UI. Look out for separate blog posts detailing how all of this works.