Creating a Notebook to model data

You can export data from Data Explorer to a Notebook and work on it in code, using one of several programming languages.

Date: October 30, 2019 | Wizata platform version 3.10.6

The Data Points on the Wizata platform that are visualised in Data Explorer can be exported to a Notebook, where you can explore them in more depth, perform analysis, and even train and deploy AI models.

A Notebook is a web-based interface to a document that contains runnable code, visualisations, and narrative text. The code in the Notebook is executed by a cluster powered by Microsoft Azure Databricks.

These are the steps to create your own Notebook:

1a - Create a Notebook from Data Explorer.

After exploring the Data Points in Data Explorer, click the "Create Notebook" icon in the menu above the charts.

1b - (Alternative) Create a Notebook from the Research & Development module.

Alternatively, you can create a Notebook from the Research & Development module: in the Jobs section, click the "+ New" button at the top right of the screen.

2 - Define the type of Notebook you want to create.

After you click the icon or button from the previous step, a menu pops up where you can customise your Notebook:

 

Going through the menu:

2.1 - Name:

This will be the name of the job that manages the data export and the creation of the Notebook.

2.2 - Command:

Select "CreateNotebook"

2.3 - Data Points:

Select the Data Points to be studied. You can check what they look like in Data Explorer before creating the Notebook.

If you need to be introduced to Data Explorer, check the following article:

https://knowledge.wizata.com/getting-started-with-the-data-explorer

Once you know the Data Points you want to analyse in the Notebook, select them from the list:

At this point, you also have to choose the function used to resample the data when the sampling Interval is coarser than the granularity of the raw data (Avg, Min, Max, EvenCount, Sum, Standard Deviation, First or Last). To learn more about this function, see: https://knowledge.wizata.com/changing-the-data-aggregation-type
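To make the choice concrete, the sketch below applies each aggregation to the raw values that fall into one sampling interval. It is an illustration in plain Python, not the platform's implementation. The `AGGREGATIONS` mapping and the `summarise` helper are hypothetical names. "EvenCount" is assumed to count the raw values in the interval, and Standard Deviation is assumed to be the sample standard deviation.

```python
import statistics

# Hypothetical mapping from the menu options to plain Python functions.
# Assumptions: "EvenCount" counts the raw values in the interval, and
# "Standard Deviation" is the sample standard deviation.
AGGREGATIONS = {
    "Avg": statistics.mean,
    "Min": min,
    "Max": max,
    "EvenCount": len,
    "Sum": sum,
    "Standard Deviation": statistics.stdev,
    "First": lambda values: values[0],
    "Last": lambda values: values[-1],
}


def summarise(values, function_name):
    """Collapse the raw values of one sampling interval into a single value."""
    return AGGREGATIONS[function_name](values)


# Raw readings that fall into the same sampling interval, in time order.
readings = [20.0, 22.0, 21.0, 25.0]

for name in AGGREGATIONS:
    print(name, summarise(readings, name))
```

With these readings, "Avg" gives 22.0 while "Last" gives 25.0, so the choice of function can change the exported values noticeably.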

2.4 - Date Interval:

Select the interval of dates for the data to be exported to the Notebook:

2.5 - Interval:

Here you can choose the sampling Interval. Type in each box how many years ("y"), months ("m"), weeks ("w"), days ("d"), hours ("h"), minutes ("m") or seconds ("s") the sampling interval consists of. You can also obtain data summarised by production "Batch".
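As a rough way to reason about the size of an export, the sketch below combines the per-unit counts into one fixed-length interval and estimates how many rows a Date Interval would produce per Data Point. The helper names `sampling_interval` and `expected_points` are hypothetical, and the estimate assumes one row per whole interval. Years and months are left out because they do not have a fixed length.

```python
from datetime import datetime, timedelta


def sampling_interval(weeks=0, days=0, hours=0, minutes=0, seconds=0):
    """Build a fixed-length sampling interval from the per-unit counts."""
    return timedelta(weeks=weeks, days=days, hours=hours,
                     minutes=minutes, seconds=seconds)


def expected_points(start, end, interval):
    """Number of whole sampling intervals between start and end."""
    if interval <= timedelta(0):
        raise ValueError("the sampling interval must be positive")
    return (end - start) // interval


# One week of data sampled every hour.
week_start = datetime(2019, 10, 1)
week_end = datetime(2019, 10, 8)
print(expected_points(week_start, week_end, sampling_interval(hours=1)))
```

For one week at an hourly interval this prints 168 (7 days of 24 hours).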

As an example, if the raw data has one-second granularity and you choose an Interval of 1 hour ("h"), you will export hourly Data Points to your Notebook, and the per-second data will be summarised into hourly data using the function chosen in step 2.3.
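The example above can be sketched in plain Python as follows. This only illustrates the idea of resampling and does not show how the platform performs it. The `resample_hourly` helper and the synthetic data are made up, "Avg" is taken as the chosen function, and labelling each bucket by the start of its hour is an assumption.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def resample_hourly(samples, aggregate=mean):
    """Group (timestamp, value) pairs by hour and aggregate each group."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Truncate the timestamp to the start of its hour (assumed labelling).
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {hour: aggregate(values) for hour, values in sorted(buckets.items())}


# Two hours of synthetic one-second data: the value is constant within each hour.
start = datetime(2019, 10, 30, 8, 0, 0)
raw = [(start + timedelta(seconds=i), 10.0 if i < 3600 else 20.0)
       for i in range(7200)]

hourly = resample_hourly(raw)
for hour, value in hourly.items():
    print(hour.isoformat(), value)
```

The 7200 raw rows collapse into two hourly rows. Passing a different function as `aggregate`, such as `max`, corresponds to choosing another option in step 2.3.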

2.6 - Projects:

Here you can assign the Notebook to an existing Project.

2.7 - Name:

This will be the name of your Notebook.

2.8 - Environment:

Choose the option "Azure Databricks".

2.9 - Language:

In this field you can select the coding language of your Notebook.

2.10 - Format:

Regarding the format of your Notebook, there are 4 options:

Select the option "Jupyter". Jupyter is an open-source, user-friendly format that supports both coding and visualisation.

HTML is used to visualise an existing Notebook on the platform.

DBC exports the Notebook as a DBC archive, the Databricks archive format.

Source returns the content of the Notebook in JSON format.
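For orientation, a Jupyter notebook file (.ipynb) is itself a JSON document with a top-level "cells" list, so it can be inspected with standard tools. The sketch below builds a minimal, made-up notebook in memory and extracts the source of its code cells. The `code_cells` helper is a hypothetical name, and the notebook content is not taken from the platform.

```python
import json

# A minimal, made-up notebook in the Jupyter (nbformat 4) layout.
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print('hello')"]},
    ],
})


def code_cells(text):
    """Return the source of every code cell in a Jupyter notebook document."""
    notebook = json.loads(text)
    return ["".join(cell["source"]) for cell in notebook["cells"]
            if cell["cell_type"] == "code"]


print(code_cells(notebook_text))
```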

2.11 - Description:

Optionally, enter a short description of the purpose of the Notebook.

2.12 - Template Notebook:

In this section you can choose a template for your Notebook. This template will contain the basic code to start working and even some preliminary results.

The default template is available in the dropdown menu; it performs a preliminary analysis of the exported data. New templates can also be customised according to the requirements of the analysis and uploaded to the platform.

 

2.13 - Include Storage connectivity information:

When this option is enabled, your Notebook will include the code containing the credentials needed to access the Data Points you have selected.

Finally, save the changes by clicking the Save button at the bottom right of the menu.

 

3 - Access your Notebook

After about one minute, you can find the Notebook you created in the left menu, under the Research & Development module, in the Notebooks section.

Find your Notebook in the list and click its hyperlink to be redirected to the Notebook on Azure Databricks, where you can start working on it.