Study the Business Domain
We start by building a high-level understanding of your company.
Define the Business Problem
Is it about increasing click-through rates? Identifying an ideal location for your store? Whatever the objective, it is clearly framed.
Identify Data Sources
We identify the data relevant to the analysis and where it comes from, whether it is qualitative or quantitative, structured or unstructured, archived or streaming.
Data Cleansing & Transformation
The most labor-intensive step, this involves preprocessing data based on the analysis requirements. By checking for missing values, removing inconsistencies, and normalizing datasets, we make sure the data is ready for exploration.
Tools: Data Wrangler, OpenRefine
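As a rough illustration of the three checks named above, here is a minimal sketch using only the Python standard library. The records, the field names (`city`, `visits`), and the choice of median imputation and min-max scaling are assumptions made for the example, not a description of any specific project.

```python
from statistics import median

# Hypothetical raw records; None marks a missing value.
raw = [
    {"city": " Boston", "visits": 120},
    {"city": "boston ", "visits": None},
    {"city": "Austin", "visits": 80},
    {"city": "AUSTIN", "visits": 200},
]

def clean(records):
    """Return cleaned copies: consistent labels, imputed gaps, scaled values."""
    # Remove inconsistencies: trim whitespace and unify letter case.
    rows = [{**r, "city": r["city"].strip().title()} for r in records]

    # Handle missing values: impute with the median of the observed values
    # (one common choice among several; assumed here for illustration).
    observed = [r["visits"] for r in rows if r["visits"] is not None]
    fill = median(observed)
    for r in rows:
        if r["visits"] is None:
            r["visits"] = fill

    # Normalize: min-max scale visits into the range [0, 1].
    lo = min(r["visits"] for r in rows)
    hi = max(r["visits"] for r in rows)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    for r in rows:
        r["visits_scaled"] = (r["visits"] - lo) / span
    return rows

cleaned = clean(raw)
```

In practice the imputation and scaling strategy depends on the analysis requirements; the point of the sketch is only the order of operations.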
Exploratory Data Analysis (EDA)
At this stage, we get an intuition of the data, identify important variables and their relationships, detect outliers, check assumptions, and choose relevant techniques for modeling. Graphical statistical techniques, from the simple histogram and scatter plot to the probability plot and seasonal subseries plot, are used.
Tools: SPSS, Weka, R
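Two of the EDA tasks above, detecting outliers and checking how variables relate, can be sketched in a few lines of standard-library Python. The sample values, the variable names (`spend`, `clicks`), and the 1.5 × IQR rule are assumptions chosen for illustration; other outlier rules are equally valid.

```python
from math import sqrt
from statistics import mean, quantiles

# Hypothetical sample: daily ad spend and the clicks it produced.
spend = [10, 12, 11, 13, 12, 14, 11, 60]
clicks = [100, 118, 109, 131, 122, 140, 112, 150]

def iqr_outliers(values):
    """Return values outside 1.5 * IQR of the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

outliers = iqr_outliers(spend)          # values worth a closer look
relationship = pearson(spend, clicks)   # strength of the linear relationship
```

A flagged value is a prompt for investigation rather than automatic removal: it may be a data-entry error or a genuine event such as a campaign launch.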
Based on the insights from EDA and the requirements, we develop predictive or descriptive models, from simple regression to deep learning. After multiple iterations, we select the most suitable model. The selected model is then validated with test data and approved by our domain experts.
Tools: R, Python, scikit-learn, TensorFlow
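To make the fit-then-validate loop concrete, here is a minimal sketch of the simplest case mentioned above: a one-variable linear regression fitted on training data and scored on held-out test data. The data points, the hold-out rule, and the helper names (`fit_line`, `mse`) are hypothetical; a real project would typically use a library such as scikit-learn rather than hand-written least squares.

```python
# Hypothetical observations that roughly follow y = 2x + 1.
xs = list(range(10))
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.9, 15.2, 16.8, 19.1]

def fit_line(x_vals, y_vals):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x_vals)
    mx, my = sum(x_vals) / n, sum(y_vals) / n
    sxx = sum((x - mx) ** 2 for x in x_vals)
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_vals, y_vals))
    slope = sxy / sxx
    return slope, my - slope * mx

def mse(x_vals, y_vals, slope, intercept):
    """Mean squared error of the fitted line on the given points."""
    errors = [(slope * x + intercept - y) ** 2 for x, y in zip(x_vals, y_vals)]
    return sum(errors) / len(errors)

# Hold out every fifth point (indices 2 and 7) as test data; an assumed,
# deliberately simple split for illustration.
test_idx = {i for i in range(len(xs)) if i % 5 == 2}
train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i not in test_idx]
test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i in test_idx]

slope, intercept = fit_line(*zip(*train))
test_error = mse(*zip(*test), slope, intercept)
```

Scoring on points the model never saw during fitting is what lets the test error stand in for performance on new data.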
Using statistical descriptions and visualization techniques, we communicate the key findings from the models implemented. With the help of interactive tools, business users can change the variables in the model and explore potential outcomes to strategize or refine their decisions.
Tools: Gephi, D3.js, GGobi
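The idea of changing a model's variables to explore potential outcomes can be sketched as a small what-if function. Everything here is an assumption for illustration: the coefficients, the variable names (`ad_spend`, `discount_pct`), and the linear form of the model stand in for whatever model an actual engagement produces.

```python
# Hypothetical coefficients of an already-fitted linear model (illustrative only).
COEFFS = {"intercept": 50.0, "ad_spend": 2.0, "discount_pct": 1.5}

def predict(inputs):
    """Predicted outcome for a dict of input variables."""
    return COEFFS["intercept"] + sum(COEFFS[name] * value
                                     for name, value in inputs.items())

def what_if(baseline, **changes):
    """Change in the predicted outcome when some inputs are overridden."""
    scenario = {**baseline, **changes}
    return predict(scenario) - predict(baseline)

baseline = {"ad_spend": 100.0, "discount_pct": 10.0}
delta = what_if(baseline, ad_spend=120.0)  # effect of raising ad spend by 20
```

An interactive dashboard wraps the same pattern in sliders and charts, so a business user can adjust inputs without touching code.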