Praedictio: Machine Learning
Praedictio is a business
predictions framework that provides powerful predictive analytics by analysing
the business data scattered across different repositories in an organization.
The Praedictio framework enables developers and data scientists to integrate
data-driven ML models into business applications quickly and easily, with
tools for data aggregation, data modeling, training, and deployment.
Praedictio can run on-premise or
on any cloud platform and serve highly accurate business predictions that will
enable the business owners and decision makers to make timely decisions on
Machine Learning is growing in
popularity across a wide spectrum of business domains, catering to the need for
customer-focused, accurate, and robust business insights. One of the
biggest challenges in creating and maintaining a Machine Learning based
prediction system is orchestrating model creation, learning, model
validation, deployment, and infrastructure maintenance in a production
environment. With highly volatile data and continually improving
learning models, deploying fresh models becomes trickier. Most machine learning frameworks and systems
address only model training or
deployment, and connectivity between the different components is handled ad hoc
via glue code or custom scripts. Praedictio integrates the aforementioned components
into one platform, simplifying platform configuration and reducing time to
production while increasing scalability.
Praedictio introduces a modular
architecture to simplify model development and deployment across frameworks and
applications. Furthermore, by introducing caching, batching, and adaptive model
selection techniques, Praedictio reduces prediction latency and improves
prediction throughput, accuracy, and robustness without modifying the
underlying machine learning frameworks. The platform can also be integrated
with enterprise systems, while satisfying stringent data security, privacy, or
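As a rough illustration of how caching and batching can sit in front of a model without modifying the underlying framework, the sketch below wraps an arbitrary batch prediction function. The class name CachedBatchPredictor, its parameters, and the LRU eviction policy are assumptions made for this example and are not taken from the Praedictio code base.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Sequence, Tuple

Features = Tuple[float, ...]


class CachedBatchPredictor:
    """Hypothetical wrapper: LRU prediction cache plus batching of cache misses."""

    def __init__(self, predict_batch: Callable[[List[Features]], List[float]],
                 cache_size: int = 1024) -> None:
        self._predict_batch = predict_batch  # any framework's batch predict call
        self._cache: "OrderedDict[Features, float]" = OrderedDict()
        self._cache_size = cache_size
        self.model_calls = 0  # how often the underlying model was invoked

    def predict(self, inputs: Sequence[Features]) -> List[float]:
        # Collect the distinct inputs that are not cached, preserving order.
        misses: List[Features] = []
        for x in inputs:
            if x not in self._cache and x not in misses:
                misses.append(x)

        # Send all misses to the model in a single batched call.
        fresh: Dict[Features, float] = {}
        if misses:
            self.model_calls += 1
            fresh = dict(zip(misses, self._predict_batch(misses)))

        # Assemble results; cache hits are marked as most recently used.
        results: List[float] = []
        for x in inputs:
            if x in fresh:
                results.append(fresh[x])
            else:
                self._cache.move_to_end(x)
                results.append(self._cache[x])

        # Store new predictions and evict the least recently used entries.
        for x, y in fresh.items():
            self._cache[x] = y
            if len(self._cache) > self._cache_size:
                self._cache.popitem(last=False)
        return results
```

Because the wrapper only needs a callable that maps a list of inputs to a list of predictions, the same code could front a TensorFlow or scikit-learn model. Adaptive model selection is not covered by this sketch.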
Platform Design and
The Praedictio design adopts the
One machine learning platform for many learning tasks.
There is a large and growing
number of machine learning frameworks. Each framework has strengths and
weaknesses, and many are optimized for specific models or application domains
(e.g., computer vision). Thus, there is no dominant framework, and often
multiple frameworks may be used for a single application. When training data grows,
a framework with distributed training may be required, forcing a move away
from a framework that was once selected as the best available. Common model exchange formats have
been introduced in the past, but because of rapid technological advancement and
the additional errors that arise from parallel implementations for training
and serving, these common formats did not gain popularity.
We chose to use TensorFlow and
scikit-learn as the trainers, but the platform design is not limited to these
One factor in choosing (or dismissing) a machine learning platform is its
coverage of existing algorithms. Scikit-learn offers a wide variety of pre-implemented
ML algorithms, and TensorFlow provides full flexibility for implementing any
type of model architecture.
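One way to keep a platform open to trainers beyond the two chosen ones is to hide each framework behind a small common interface. The sketch below is only an assumption about how such an abstraction could look: the names Trainer, register_trainer, create_trainer, and MeanBaselineTrainer are hypothetical, and the baseline trainer is a stand-in so the example runs without third-party packages.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence, Type


class Trainer(ABC):
    """Hypothetical interface that each framework adapter would implement."""

    @abstractmethod
    def fit(self, features: Sequence[Sequence[float]],
            targets: Sequence[float]) -> None:
        ...

    @abstractmethod
    def predict(self, features: Sequence[Sequence[float]]) -> List[float]:
        ...


class MeanBaselineTrainer(Trainer):
    """Stand-in trainer: always predicts the mean of the training targets."""

    def __init__(self) -> None:
        self._mean = 0.0

    def fit(self, features, targets) -> None:
        self._mean = sum(targets) / len(targets)

    def predict(self, features) -> List[float]:
        return [self._mean for _ in features]


# Registry mapping a configuration name to a trainer class.
TRAINERS: Dict[str, Type[Trainer]] = {}


def register_trainer(name: str, cls: Type[Trainer]) -> None:
    TRAINERS[name] = cls


def create_trainer(name: str) -> Trainer:
    if name not in TRAINERS:
        raise ValueError(f"unknown trainer: {name!r}")
    return TRAINERS[name]()


register_trainer("mean-baseline", MeanBaselineTrainer)
```

An adapter for scikit-learn or TensorFlow would subclass Trainer in the same way and delegate fit and predict to the underlying model, so the rest of the pipeline would not need to know which framework is in use.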
Most machine learning pipelines
execute their components sequentially, so all the components must
be re-executed as the data to be fed grows.
This becomes a bottleneck,
since most real-world use cases require continuous training. Praedictio
supports several continuation strategies that result from the interaction
between data visitation and warm-starting options.
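The interaction between the two options can be sketched with a toy one-feature linear model trained by stochastic gradient descent. This sketch assumes that "data visitation" means whether previously seen data is revisited and that "warm-starting" means continuing from the previous weights. The function names and parameters are hypothetical and do not describe Praedictio's actual API.

```python
from typing import List, Optional, Sequence, Tuple

Example = Tuple[float, float]  # (feature x, target y)
Weights = Tuple[float, float]  # (slope w, intercept b)


def train_linear(data: Sequence[Example], weights: Optional[Weights] = None,
                 epochs: int = 500, lr: float = 0.05) -> Weights:
    """Fit y = w * x + b with plain SGD, optionally from given weights."""
    w, b = weights if weights is not None else (0.0, 0.0)
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b


def continue_training(model: Optional[Weights],
                      old_data: Sequence[Example],
                      new_data: Sequence[Example],
                      visit_old_data: bool,
                      warm_start: bool,
                      epochs: int = 5) -> Weights:
    """Combine the two options into one continuation strategy."""
    # Data visitation: revisit everything, or only the newly arrived data.
    data: List[Example] = list(new_data)
    if visit_old_data:
        data = list(old_data) + data
    # Warm-starting: continue from the previous weights, or start from zero.
    start = model if warm_start else None
    return train_linear(data, weights=start, epochs=epochs)
```

With a warm start and new data only, a few epochs are enough to absorb a new example that agrees with the existing model. A cold start on the same new data discards what was learned earlier, which is why the choice of strategy matters for continuous training.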
Easy-to-use configuration and tools.
Providing an admin and
configuration framework is only possible if components also share utilities
that allow them to communicate and share assets. A Praedictio user is
exposed to only one admin panel to manage all components.
Production-level reliability and scalability.
Only a small fraction of a
machine learning platform is the actual code implementing the training
algorithm. If the platform handles and encapsulates the complexity of machine
learning deployment, engineers and scientists have more time to focus on the modeling
Product Road Map
The Praedictio platform's road map has been envisioned to
deliver the core components iteratively.
MVP – Prediction Serving System and API gateway
Alpha – Training pipeline, Admin panel
Beta – Action Engine
Figure 1 shows a high-level component overview
and architecture of the machine learning platform and highlights the components
discussed in the following sections: