Zeppelin Notebooks Design Document

This design document proposes improving the current Zeppelin web interface with a modern UI framework and a feature-rich interface, with minimal backend changes. It will be based on the Vue.js framework.

This effort targets improvements in the following areas:[a][b]

  • UI Framework - Vue.js
  • IDE-like Interface
  • UI Snappiness
  • Performance Enhancements

The end of this document also lists the backend changes required, the items to be renamed, and the items that cannot be supported or will be removed under this proposal.

The current prototype is available at https://github.com/malayhm/zeppelin-studio

Current Problems faced by Zeppelin users

  1. Notebook loading becomes slow for large notebooks (large output).

        Notebook output can be large not only in terms of rows but also in terms of data size or number of columns.

  2. The current codebase was written long ago with an old version of the Angular framework, so it lacks innovation and the latest methodologies.
  3. The UI is cluttered with too many options.
  4. The interpreter concept is very confusing for regular users who don’t want to deal with the complexity of administration. This design addresses the UX aspects without touching the behaviour. Simplifying the interpreter concept altogether should be considered separately.
  5. The editor behaves erratically during certain operations, such as adding a paragraph or running a notebook with a large paragraph.
  6. Currently only a single user can edit a notebook at a time, otherwise glitches occur. We should aim for Google Docs style collaborative editing.
  7. We lack a ‘top level’ concept which groups notebooks/interpreter-settings/configs/access-permissions. Users would benefit from a “project abstraction”, which can also act as the base for ACL and the homepage.
     1. https://issues.apache.org/jira/browse/ZEPPELIN-3576

IDE Interface

The proposed interface will resemble a typical IDE, with multiple components that make the environment easier to use and more productive.

It will have the following components:

  1. Top Navigation Bar - Menu System
  2. Left Sidebar
  3. Tabbed Layout
  4. Status Bar

Top Navigation Bar - Menu System

The Top Navigation Bar will let users discover all available options in one place instead of having them scattered across the UI.

  • File, Edit and View will have all Notebook and Paragraph related options.
  • Runtime will have run and Interpreter options.
  • Settings will have all the configuration related to Notebook Repo, Credentials, etc.
  • Tools will have Preferences and Keyboard shortcuts.

Note: As most data science users don’t want to know what an interpreter is or how to deal with it, we can move it into the Runtime menu so that advanced users can still use it. The default configuration should ensure that basic users can use the Notebook setup without any hurdles.

Left Sidebar

With the new left sidebar, a user can see the list of notebooks without losing the context of the open Notebook.

The left sidebar will be divided into 3 sections:

  1. Left navbar
     1. Folders: Shows the folder tree in the sidebar.
     2. Activity Console: Jobs are renamed to Activity, because “Jobs” suggests only scheduled jobs, whereas Activity gives insight into what is going on across the notebooks.[c][d]
     3. Helium: Package Manager[e][f]

        A Preferences button at the bottom of the navbar will open the Preferences panel, where a user can quickly choose settings.

  2. Action area for the active navbar section

        Each navbar section will have options on top: search, create, etc.

  3. Active navbar content

        A list of items based on the active area: folder tree, activity list, package list, etc.

Item-specific options will be available as a dropdown on each item.

It will also have a context menu depending on the active section.

Left Sidebar will be resizable and will have an option to collapse/expand the Action and Content area.

Table of Contents

We also want to include a Table of Contents (ToC) in the sidebar for the open / active Notebook: a list of the Notebook’s paragraphs identified by title / id.

From the ToC, a user can see the number of paragraphs and the status of each paragraph.

A user can also perform paragraph actions from the ToC, e.g. rearrange paragraphs, delete, etc.
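The ToC described above could be derived from the paragraph list roughly as follows. This is a minimal TypeScript sketch, not the prototype's code: `ParagraphInfo`, `TocEntry`, `buildToc` and `moveParagraph` are hypothetical names, and the status strings are placeholders.

```typescript
// Hypothetical paragraph shape; field names and status values are illustrative.
interface ParagraphInfo {
  id: string;
  title?: string;
  status: string;
}

interface TocEntry {
  label: string;
  status: string;
}

// Build ToC entries, labelling each paragraph by title and falling back to its id.
function buildToc(paragraphs: ParagraphInfo[]): TocEntry[] {
  return paragraphs.map((p) => ({
    label: p.title && p.title.trim() !== "" ? p.title : p.id,
    status: p.status,
  }));
}

// Rearranging from the ToC: return a copy with one item moved to a new position.
function moveParagraph<T>(items: T[], from: number, to: number): T[] {
  const copy = items.slice();
  const [moved] = copy.splice(from, 1);
  if (moved !== undefined) copy.splice(to, 0, moved);
  return copy;
}
```

Returning a new array from `moveParagraph` instead of mutating in place is a design choice that tends to fit reactive UI frameworks, where replacing the list is an easy way to trigger a re-render.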

Tabbed Layout

        A user can open multiple notebooks in a single browser tab, so they don’t get lost across different browser tabs, and the sidebar context remains available.

        The interpreter list will also open as a separate tab (single instance).
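A minimal TypeScript sketch of this tab behaviour, assuming tabs are tracked by a unique key. `Tab`, `TabManager` and the `note:` key prefix are hypothetical names used only for illustration.

```typescript
// Hypothetical tab model; `kind` and the key scheme are illustrative.
interface Tab {
  key: string;
  kind: "notebook" | "interpreters";
  title: string;
}

class TabManager {
  readonly tabs: Tab[] = [];
  activeKey: string | null = null;

  // Open a tab, or only focus it when a tab with the same key already exists.
  private openTab(tab: Tab): void {
    if (!this.tabs.some((t) => t.key === tab.key)) this.tabs.push(tab);
    this.activeKey = tab.key;
  }

  openNotebook(noteId: string, title: string): void {
    this.openTab({ key: `note:${noteId}`, kind: "notebook", title });
  }

  // The fixed key means there is at most one interpreter tab.
  openInterpreters(): void {
    this.openTab({ key: "interpreters", kind: "interpreters", title: "Interpreters" });
  }
}
```

Re-opening a notebook that is already open only switches focus, which keeps one tab per notebook as well as a single interpreter tab.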

Status Bar

The Status Bar will show various useful details and controls, e.g. WebSocket connection details, a dark mode toggle, etc.

We can also show the current paragraph id / line number there, so that a user knows where they are in the current editor.


Theming

Users can switch between light and dark mode to adjust the UI color scheme.

For customization, users can create their own themes to match their own UI.


Notebook Editor UX

The proposed Notebook Editor will be decluttered, without boxes everywhere.

Each notebook will open in its own tab in the right-hand content area.

Notebook Controls

There will be a dedicated panel in each opened notebook with the following options: explicit save button, versions, layout type, etc.

Paragraph

  • There will be two paragraph types: code (current) and text (md).

  • There will be separate options for adding a paragraph of each type between two paragraphs.
  • The Run paragraph button is moved to the left-hand side for better navigation; it can also show the running status itself.
  • Each paragraph will have a dropdown menu with all the possible actions.

Interpreter UX

The Interpreter is an advanced feature for basic Notebook users, so the overall feature needs simplification. This design change, however, aims at improving the UX aspect rather than reworking the feature completely.

The Interpreter option will be in the top navbar so that advanced users can use it.

It will open as a separate tab with a list of interpreters on the left-hand side of the content and a list of properties for the selected interpreter on the right-hand side.

There will be a search box at the top of the content.

UI Snappiness & Performance Improvements

  1. Asynchronous Paragraph Addition

        At present, when a user clicks the `Add Paragraph` button, the UI sends a request over the WebSocket connection and the backend performs the addition. The new paragraph appears only after the backend responds, so it takes some time before it is visible to the user.

Although this delay is small compared to other API operations, it grows with the Notebook size.
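One way to read this item is an optimistic insert: the UI shows the new paragraph immediately and reconciles it once the backend confirms. The TypeScript sketch below illustrates that reading only. `Paragraph`, `NotebookView`, the `tmp-` id scheme and the `send` callback are hypothetical and do not reflect Zeppelin's actual WebSocket messages.

```typescript
// Hypothetical paragraph model; field names are illustrative only.
interface Paragraph {
  id: string;
  text: string;
  pending: boolean; // true until the backend confirms the insert
}

type SendFn = (tempId: string, index: number) => void;

class NotebookView {
  readonly paragraphs: Paragraph[] = [];
  private nextTempId = 0;
  private readonly send: SendFn;

  // `send` stands in for whatever pushes the request over the WebSocket.
  constructor(send: SendFn) {
    this.send = send;
  }

  // Insert a placeholder right away, then ask the backend to create the paragraph.
  addParagraph(index: number): string {
    const tempId = `tmp-${this.nextTempId++}`;
    this.paragraphs.splice(index, 0, { id: tempId, text: "", pending: true });
    this.send(tempId, index);
    return tempId;
  }

  // Called when the backend reports the id it assigned to the new paragraph.
  confirm(tempId: string, serverId: string): void {
    const p = this.paragraphs.find((x) => x.id === tempId);
    if (p) {
      p.id = serverId;
      p.pending = false;
    }
  }
}
```

A real implementation would also need to handle a failed or rejected insert, for example by removing the placeholder; that is left out here.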

  2. Trim Output

At present, Zeppelin allows viewing up to 1000 rows and all the columns of the paragraph output. Since Notebooks are used in data science to slice and dice Big Data problems, multiple paragraphs in a single notebook can bloat the content and slow down the browser.

Since only a limited amount of content is visible in the viewport, and there are no spreadsheet-like features, it makes little sense to show the full data in the browser.

Output should be trimmed not only by number of rows but also by columns, because columns holding large values (e.g. a description column) can adversely affect notebook performance.

Proposal:

  1. Show all the columns only when a user specifies each column explicitly in the paragraph request.
  2. Trim columns and rows beyond a certain limit and show an ellipsis in between.
  3. All the limits have to be implemented on the backend and not on the UI, to reduce the impact.
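The trimming in item 2 could look roughly like the sketch below. It is written in TypeScript only to illustrate the logic; per item 3 the real limits would live on the backend. `Table`, `trimMiddle`, `trimTable` and the limit parameters are hypothetical, and the sketch assumes every row has the same number of cells as there are columns.

```typescript
// Hypothetical tabular result; Zeppelin's real output format may differ.
interface Table {
  columns: string[];
  rows: string[][];
}

const ELLIPSIS = "...";

// Keep the first and last items and put a marker in between when the list
// is longer than `limit`. The result has at most max(1, limit) items,
// marker included.
function trimMiddle<T>(items: T[], limit: number, marker: T): T[] {
  const n = Math.max(1, limit);
  if (items.length <= n) return items.slice();
  const head = Math.ceil((n - 1) / 2);
  const tail = n - 1 - head;
  return [...items.slice(0, head), marker, ...items.slice(items.length - tail)];
}

// Trim columns first, then rows, so the filler row matches the trimmed width.
function trimTable(table: Table, maxRows: number, maxCols: number): Table {
  const columns = trimMiddle(table.columns, maxCols, ELLIPSIS);
  const rows = table.rows.map((row) => trimMiddle(row, maxCols, ELLIPSIS));
  const fillerRow = columns.map(() => ELLIPSIS);
  return { columns, rows: trimMiddle(rows, maxRows, fillerRow) };
}
```

Item 1 (showing all columns when the user names them explicitly) would amount to skipping the column trim for that request; how the backend detects an explicit column list is not specified here.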

Backend Changes Required

  1. At present, all notebook operations are performed asynchronously on the assumption that only one notebook is open, so requests are not identified by Notebook Id. This can cause a problem if a user switches notebooks before the response for the previous Notebook is received.

  2. As proposed in this design, the tabbed layout brings many of the benefits of an IDE and lets a user perform certain operations without losing context.[g][h]

    But as mentioned in point #1, multi-tabbed behavior needs request and response identification in order to show the correct updates and perform operations across the open notebooks.
  3. We have to change how package management icons are loaded.
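The request and response identification mentioned in the first two points could be sketched as below, assuming every message carries the id of the notebook it belongs to. `NoteMessage`, `NoteMessageRouter` and all field names are hypothetical; the actual message format would be defined by the backend change.

```typescript
// Hypothetical message envelope: every request and response carries the note
// id (and a request id) so a reply can be matched to the tab that asked for it.
interface NoteMessage {
  noteId: string;
  requestId: string;
  op: string;
  payload: unknown;
}

type NoteHandler = (msg: NoteMessage) => void;

class NoteMessageRouter {
  private readonly handlers = new Map<string, NoteHandler>();

  // Register the handler for one open notebook tab.
  open(noteId: string, handler: NoteHandler): void {
    this.handlers.set(noteId, handler);
  }

  close(noteId: string): void {
    this.handlers.delete(noteId);
  }

  // Deliver a response to the tab it belongs to. Returns false when that
  // notebook is no longer open, so stale responses are dropped.
  dispatch(msg: NoteMessage): boolean {
    const handler = this.handlers.get(msg.noteId);
    if (!handler) return false;
    handler(msg);
    return true;
  }
}
```

With this kind of routing, a response that arrives after the user has switched tabs updates the notebook it was meant for rather than whichever notebook happens to be active.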

Items removed / renamed / added

  1. Grid layout will be removed
  2. There will be no angular interpreter support
  3. Interpreter binding is going to be removed in 0.9, hence it will not be available here either
  4. Top search bar will be removed
  5. Jobs will be renamed as Activity Console
  6. Themes will be added
  7. Preferences will be added across the Notebooks
  8. Should we rename Zeppelin Notebooks interface to Zeppelin Studio for this project?

In the future, we can plan to ship this project as a cross-platform Electron app with an option to connect to any supported Zeppelin server, with an authentication mechanism in place. This would allow us to get rid of the browser-related limitations.

Tools and Technologies

UI Framework: Vue.js

Design System, Components, Icons: Ant Design

[a]Could you also list any backend changes that this proposal depends on ? Or could you just list the behavior change compared to the current zeppelin so that I can help figure out what needs to be changed.

[b]I have added a separate section for the backend changes at the end of this document.

[c]We can keep it for now. But I believe the jobs related work needs to be redesigned, so don't spend too much time on this.

[d]Yes, should we think about redesigning it in this proposal or separately?

If the redesign is needed in terms of behavior, then it has to go separately, but if it's in terms of UX redesign, then it can be combined in this proposal.

[e]Will Helium still work in the new frontend framework ?

[f]We have to make it work in the new front end framework but the effort is still unknown.

[g]Will this cause memory pressure on frontend ?

[h]We can try to get the memory footprint and see the impact but if we decide to limit the output not only in terms of rows but columns as well, then it should not be a big impact. We can revisit this in P3 item list as well.