Introduction to the RSP Notebook Aspect
Most RSP users will find Jupyter Notebooks to be the most efficient and powerful way to interact with the DP0.1 data set. For DP0.1, Jupyter Notebooks will be the only way to access images.
Log In and Out of JupyterLab
From the RSP landing page at data.lsst.cloud, click on the central panel for Notebooks.
Software Version and Server Size: The first page offers a choice of software environment version (left) and server size (right), as in the figure below. Most users will choose the recommended software version and a medium server size.
The term “image” atop the left box refers to a “Docker image” that defines the software packages, and their versions, that will be automatically loaded in the server environment. The “recommended” image will be updated on a monthly basis during DP0.1 to encourage users to adapt to using software that is under active development, and to benefit from the bug fixes and updates made by Rubin staff. Older images will remain accessible to users.
RSP users who are doing a lot of image processing might need to select a large server, and those who are working with small subsets of catalog data can use a small server.
Start the Server: Pressing the orange start button launches the server and displays this page with a blue progress bar.
JupyterLab Navigation: The JupyterLab landing page in the figure below is the launch pad for all JupyterLab functionality (e.g., Notebook, Terminal). Return to this launch pad at any time by clicking the plus symbol at upper left. For DP0.1, only Python Notebooks and the terminal interface are supported (i.e., not the Console), and users are not able to access Portal queries from a Jupyter Notebook.
In the left-most vertical sidebar of icons, the top icon is a file folder, and that is the default view. The left sidebar lists folders in the user’s home directory (e.g., DATA, WORK, and notebooks). Launching a terminal will put you into the Linux environment, running bash. From there, the command “ls” will return the same list. Navigate the file system and open files by double-clicking on folders and files in the left sidebar. All users will find a set of tutorial notebooks provided in the notebooks directory (Examples-DP0-1). Jupyter Notebooks can be identified by their suffix “.ipynb”.
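The same file-system navigation can also be done from a notebook cell. The following is a minimal sketch using only the Python standard library; the helper name `list_notebooks` is hypothetical and not part of the RSP.

```python
from pathlib import Path


def list_notebooks(root):
    """Return a sorted list of .ipynb files found anywhere below `root`."""
    root = Path(root).expanduser()
    if not root.is_dir():
        # A missing directory simply yields no notebooks.
        return []
    return sorted(
        path
        for path in root.rglob("*.ipynb")
        # Skip Jupyter's automatic checkpoint copies.
        if ".ipynb_checkpoints" not in path.parts
    )
```

Calling `list_notebooks("~/notebooks")` in a cell would return the notebook files under the notebooks folder of the home directory, assuming that folder exists; otherwise it returns an empty list.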
Safely Log Out of JupyterLab: Use the File item in the top menu bar. To safely shut down a Notebook, choose Close and Shutdown Notebook. To safely shut down a JupyterLab server and log out of the RSP, choose Save all, Exit, and Log Out.
How to Use a Jupyter Notebook
The best way to learn how to use a Jupyter Notebook is to open a tutorial notebook in the RSP and follow the directions. The “first step” notebook designed for this purpose begins by describing Jupyter notebooks: /notebooks/tutorial-notebooks/01_Intro_to_DP0_Notebooks.ipynb. To reach it, click on the “Folder” icon, then on the “notebooks” line, and then on the “tutorial-notebooks” line. Open the notebook by clicking on the 01_Intro_to_DP0_Notebooks.ipynb line. Note that you can have multiple notebooks open simultaneously, each with its own tab.
Jupyter notebooks provide “cells” within which you type either Python code or markdown. To execute a cell, either press Return while holding down the Shift key, or click the button that looks like a triangle pointing to the right, in the ribbon just below the tabs for individual notebooks. If the cell contains Python code, the code will be executed. If it contains markdown, it will be rendered to yield nicely formatted text (for some handy markdown tips, see this blog post or the relevant section from the JupyterLab documentation).
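As an illustration of what two consecutive code cells might contain, here is a minimal sketch. The function and variable names and the numbers are made up for the example. Names defined when the first cell is executed remain available to later cells in the same notebook.

```python
# Cell 1: define a helper and some data. Executing this cell prints nothing.
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


magnitudes = [17.2, 18.4, 19.0]  # made-up example values

# Cell 2: reuse the names defined in Cell 1 and print a result.
result = mean(magnitudes)
print(f"mean = {result:.2f}")  # prints "mean = 18.20"
```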
What is a kernel? In the RSP Notebook Aspect, your notebooks will be operating in a kernel that has access to the full Rubin Science Pipelines, including the “Butler” (see below) that will be your main access tool to extract images and catalogs from the DP0.1 data repository. Many standard Python libraries and modules will be available, and users can install additional Python tools they wish to use. In DP0.1, the Notebook Aspect will not offer access to queries from the Portal Aspect.
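To find out whether a particular library is already available in the current kernel before installing anything, a check along the following lines can be used. This is a sketch using the standard library's `importlib`; the helper name `is_available` is hypothetical, and the module names in the loop are only examples.

```python
import importlib.util


def is_available(module_name):
    """Return True if `module_name` can be imported in the current kernel."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError: a parent package is missing.
        # ValueError: the module has no usable import spec.
        return False


# Example names only; replace them with the modules you care about.
for name in ("numpy", "lsst.daf.butler"):
    status = "available" if is_available(name) else "not found"
    print(f"{name}: {status}")
```

Outside the RSP, the second name would normally be reported as not found, because the Rubin Science Pipelines are not part of a standard Python installation.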
Is all the code in Python? To access DP0.1 data from the Notebook Aspect, users will need to use Python commands and code. We have provided many tutorial notebooks to help you get started. One feature of Jupyter notebooks is that these tutorials are not just text that you read, but contain executable examples of the commands required to access and analyze data. Thus they are not only a resource for learning how to use Rubin tools and science pipelines, but can also serve as “seeds” from which you can borrow lines of code and alter them to suit your purposes. Because some facility with programming, and in particular with Python, will be necessary to make use of the Notebook Aspect, we recommend that users who are unfamiliar with Python learn some basics. There are countless resources on the internet to help you learn Python; here are a few that we recommend:
GETTING STARTED WITH PYTHON BY PYTHON.ORG https://www.python.org/about/gettingstarted/
SET OF PYTHON LESSONS FROM GURU99 https://www.guru99.com/python-tutorials.html
What is the Butler? The only way to access DP0.1 images is via the Butler (a middleware component of the Data Management System, DMS, for persisting and retrieving image datasets) from a Jupyter Notebook. The third-generation “Gen3” Butler is the version being used for DP0.1. The full Butler documentation is available at https://pipelines.lsst.io/modules/lsst.daf.butler/index.html. Section 3.1 of the notebook mentioned above, /notebooks/tutorial-notebooks/01_Intro_to_DP0_Notebooks.ipynb, illustrates the use of the Butler. A more extensive tutorial notebook that provides more detail is /notebooks/tutorial-notebooks/04_Intro_to_Butler.ipynb.
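The general pattern for retrieving a dataset with the Gen3 Butler can be sketched as follows. This is an illustrative outline, not a substitute for the tutorial notebooks. The helper names `open_butler` and `fetch_dataset` are hypothetical, and the repository location, collection name, and data identifier values are placeholders to be taken from the tutorials. The `lsst.daf.butler` import only succeeds where the Rubin Science Pipelines are installed, such as in the RSP Notebook Aspect.

```python
def open_butler(repo, collection):
    """Create a Butler for `repo`, searching `collection` by default.

    The import is done inside the function so that this code can be
    loaded even where the Rubin Science Pipelines are not installed.
    """
    from lsst.daf.butler import Butler

    return Butler(repo, collections=collection)


def fetch_dataset(butler, dataset_type, data_id):
    """Retrieve one dataset (e.g. a "calexp" image) identified by `data_id`."""
    return butler.get(dataset_type, dataId=data_id)


# Usage inside the RSP (placeholders; take real values from the tutorials):
#   butler = open_butler(repo, collection)
#   data_id = {"visit": <visit number>, "detector": <detector number>}
#   calexp = fetch_dataset(butler, "calexp", data_id)
```

Keeping the retrieval step in its own function, which accepts any object with a `get` method, makes it possible to try out surrounding analysis code with a stand-in object before running it against the real data repository.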
Where to get support? If you are not experienced at accessing data via Jupyter notebooks, or using a Science Platform more generally, you are not alone. The Rubin Observatory Community forum provides a searchable, community-based discussion platform that you can use as a resource to ask questions of the worldwide Rubin user community, to share your own tips and analyses, and to discuss all aspects of Rubin Observatory, including using the RSP and the Rubin Science Pipelines, understanding the data products, and discussing Rubin-related science. Specifically, you might find it useful to attend one of the “Delegate Assemblies” listed on the DP0 Delegates homepage.
Additional RSP Notebook Documentation
Additional documentation for the RSP Notebook Aspect deployed at the National Center for Supercomputing Applications (NCSA) is available at nb.lsst.io, but beware that much of it does not apply to the RSP Notebook Aspect deployed in the Google Cloud for DP0.