Setting Up Anaconda And Jupyter Notebook On GNU/Linux

The Jupyter Notebook is an open-source interactive web application written in Python. The official Jupyter documentation recommends installing Python and Jupyter Notebook through the Anaconda Distribution. This article documents how to set up Anaconda and Jupyter Notebook, and how the entire process is automated with a shell script.

Introduction

We strongly recommend installing Python and Jupyter using the Anaconda Distribution, which includes Python, the Jupyter Notebook, and other commonly used packages for scientific computing and data science. – Installing Jupyter

Anaconda

Anaconda Distribution is the easiest way to do Python data science and machine learning. It includes 250+ popular data science packages and the conda package and virtual environment manager for Windows, Linux, and MacOS. Conda makes it quick and easy to install, run, and upgrade complex data science and machine learning environments like Scikit-learn, TensorFlow, and SciPy. Anaconda Distribution is the foundation of millions of data science projects as well as Amazon Web Services’ Machine Learning AMIs and Anaconda for Microsoft on Azure and Windows. – https://www.anaconda.com/what-is-anaconda/

Jupyter

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. – https://jupyter.org/

Official Document

Shell Script

The entire installation and configuration process has been implemented as a shell script, and the code is hosted on GitLab. Usage:

# either 'curl -fsL' or 'wget -qO-' can be used to fetch the script

# for help info, append '-h' after the trailing '--'
curl -fsL https://gitlab.com/MaxdSre/axd-ShellScript/raw/master/assets/software/Anaconda.sh | sudo bash -s --

To make Jupyter Notebook easier to manage, I set up some command aliases in ~/.bashrc, as follows:

# jupyter notebook Start
alias jnl="jupyter notebook list | sed '/running servers/d'"
alias jnb="(nohup jupyter notebook &> /dev/null &); sleep 2; jnl"
alias jne="ps -ef | awk 'match(\$0,/(jupyter-noteboo|Anaconda\/bin)/)&&!match(\$0,/awk/){print \$2}' | xargs kill -9 2> /dev/null"
alias jni="curl -fsL https://gitlab.com/MaxdSre/axd-ShellScript/raw/master/assets/software/Anaconda.sh | sudo bash -s -- -U"
alias jnr="sudo /opt/Anaconda/bin/conda remove"
# jupyter notebook End
  • jnl: list running servers;
  • jnb: start a new server;
  • jne: stop all running servers;
  • jni: search for, install, or update packages via conda; run jni a to update the entire conda environment;
  • jnr: remove a package via conda.
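
The jne alias above stops servers with kill -9. As a gentler variation, the sketch below extracts the port from each line printed by jupyter notebook list and hands it to jupyter notebook stop (a subcommand added in notebook 5.0). The function names jn_ports and jn_stop_all are hypothetical, and the parsing assumes the "URL :: directory" line format shown in the Testing section.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the author's script.

# Read `jupyter notebook list` output on stdin and print one port per server.
# Lines that do not start with an http(s) URL (such as the header) are skipped.
jn_ports() {
    sed -n -E 's@^https?://[^/:]+:([0-9]+)/.*@\1@p'
}

# Ask every running server to shut down cleanly instead of using kill -9.
jn_stop_all() {
    local port
    jupyter notebook list | jn_ports | while read -r port; do
        jupyter notebook stop "${port}"
    done
}
```

jn_stop_all could be used in place of jne when a clean shutdown matters more than speed.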

Anaconda

The official download page is https://www.anaconda.com/download, which offers installers for both Python 3.6 and Python 2.7. Choose the version that fits your needs.

Release Version

At the time of writing, the latest release of Anaconda is 5.2.

You can use the following command to extract the latest version information:

curl -fsL https://www.anaconda.com/download/ | sed -r -n '[email protected]<\/[^>]+>@\[email protected];p' | sed -r -n '/>Release Date:/{[email protected][[:space:]]*<[^>]*>[[:space:]]*@@g;[email protected]^[^:]*:[[:space:]]*(.*)@\[email protected];p}; /Anaconda.*Linux-x86_64/{/Installer/{[email protected]*href="([^"]*)".*@\[email protected];[email protected]*Anaconda[^-]*-([^-]*).*[email protected]\[email protected];p;q}}' | sed ':a;N;$!ba;[email protected]\[email protected]|@g'

Output:

February 15, 2018|5.1.0

May 30, 2018|5.2.0
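
The version number in that output comes from the installer file name. As a minimal sketch of just that step, the function below (a hypothetical helper, not taken from the author's script) extracts the version from a name such as Anaconda3-5.2.0-Linux-x86_64.sh, assuming the name keeps the Anaconda<N>-<version>-<platform> layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the version embedded in an installer file name.
# Prints nothing if the name does not follow the expected layout.
anaconda_version_from_name() {
    basename "$1" | sed -n -E 's@^Anaconda[^-]*-([^-]+)-.*$@\1@p'
}

# Usage: anaconda_version_from_name ~/Downloads/Anaconda3-5.2.0-Linux-x86_64.sh
```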

Verification

Anaconda doesn’t provide hash verification info for the package directly on its download page. The relevant information is stored on the page Anaconda installer file hashes. The page Hashes for all files lists the sha256 hash values of the historical versions of Anaconda.

Taking Anaconda3-5.2.0-Linux-x86_64.sh as an example, the page Hashes for Anaconda3-5.2.0-Linux-x86_64.sh lists the details of the installation package.

item           details
Last Modified  2018-05-30 13:05:43
size (bytes)   651745206
md5            3e58f494ab9fbe12db4460dc152377b5
sha256         09f53738b0cd3bb96f5b1bac488e5528df9906be2480fe61df40e0e0d19e3d48

The hash check can be performed with either of the following commands:

# use ${HOME} rather than a quoted '~', which the shell would not expand
file_path="${HOME}/Downloads/Anaconda3-5.2.0-Linux-x86_64.sh"

# via sha256sum
sha256sum "${file_path}"

# via openssl
openssl dgst -sha256 "${file_path}"
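
Both commands only print the digest, so the comparison with the published value is still manual. The sketch below automates that comparison. The function name verify_sha256 is hypothetical, and the expected hash has to be copied from the hashes page by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file's SHA-256 digest equals the
# expected value. A missing file yields an empty digest, so it also fails.
verify_sha256() {
    local file="$1" expected="$2" actual
    actual=$(sha256sum "${file}" 2> /dev/null | awk '{print $1}')
    [[ -n "${actual}" && "${actual}" == "${expected}" ]]
}

# Usage (hash copied from the Anaconda hashes page):
# verify_sha256 "${HOME}/Downloads/Anaconda3-5.2.0-Linux-x86_64.sh" \
#     09f53738b0cd3bb96f5b1bac488e5528df9906be2480fe61df40e0e0d19e3d48 \
#     && echo 'hash OK' || echo 'hash mismatch'
```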

Demonstration example

[~/Downloads]$ sha256sum Anaconda3-5.2.0-Linux-x86_64.sh
09f53738b0cd3bb96f5b1bac488e5528df9906be2480fe61df40e0e0d19e3d48  Anaconda3-5.2.0-Linux-x86_64.sh
[~/Downloads]$ openssl dgst -sha256 Anaconda3-5.2.0-Linux-x86_64.sh
SHA256(Anaconda3-5.2.0-Linux-x86_64.sh)= 09f53738b0cd3bb96f5b1bac488e5528df9906be2480fe61df40e0e0d19e3d48

Installation

Once the sha256 check passes, refer to the official document https://docs.anaconda.com/anaconda/install/ for installation.

Run the following command to install:

bash ~/Downloads/Anaconda3-5.2.0-Linux-x86_64.sh

By default, the Anaconda installer is interactive and requires user input; see Installing on Linux for details.

According to the official documentation, it also supports silent installation.

Specify the -b flag to run in silent (batch) mode; you can also use the -p flag to customize the installation path. The flags are explained below:

  1. -b: Batch mode with no PATH modifications to ~/.bashrc. Assumes that you agree to the license agreement. Does not edit the .bashrc or .bash_profile files.
  2. -p: Installation prefix/path.
  3. -f: Force installation even if prefix -p already exists.

Taking the installation path /opt/Anaconda as an example, the installation command is:

installation_dir='/opt/Anaconda'
bash ~/Downloads/Anaconda3-5.2.0-Linux-x86_64.sh -b -f -p "${installation_dir}"

$PATH

After a silent installation, the conda command still cannot be run directly, because the executable path ${installation_dir}/bin/ is not in the environment variable $PATH. You need to add it to $PATH, for example by sourcing ${installation_dir}/bin/activate.

In interactive mode, the Anaconda installer just adds the following line to ~/.bashrc:

export PATH="${installation_dir}/bin:$PATH"

My personal advice is to place this line in a file under /etc/profile.d/ instead, so that other users can also use conda.
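
A minimal sketch of that idea follows. The function name write_conda_profile and the default file name /etc/profile.d/anaconda.sh are assumptions for illustration; writing under /etc/profile.d/ requires root. The generated snippet only prepends the directory when it is not already in $PATH, so sourcing it twice is harmless.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a profile snippet that prepends <install_dir>/bin
# to PATH. The default target file name is an assumption.
write_conda_profile() {
    local install_dir="$1"
    local target="${2:-/etc/profile.d/anaconda.sh}"
    # ${install_dir} is expanded now; \${PATH} stays literal in the snippet.
    cat > "${target}" <<EOF
# Added for Anaconda: make conda available to every login shell.
case ":\${PATH}:" in
    *":${install_dir}/bin:"*) ;;
    *) export PATH="${install_dir}/bin:\${PATH}" ;;
esac
EOF
}

# Usage (as root): write_conda_profile /opt/Anaconda
```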

Run either of the following commands to update conda:

# 1 - conda is in $PATH
conda update conda
# 2 - absolute path
/opt/Anaconda/bin/conda update conda
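
A script cannot know in advance which of the two cases applies. The sketch below picks whichever works: it prefers a conda found through $PATH and otherwise falls back to the absolute path. The function name resolve_conda is hypothetical, and /opt/Anaconda is only the example prefix used in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the conda executable to use.
# Prefers whatever is on $PATH, else falls back to <install_dir>/bin/conda.
resolve_conda() {
    local install_dir="${1:-/opt/Anaconda}"
    if command -v conda > /dev/null 2>&1; then
        command -v conda
    elif [[ -x "${install_dir}/bin/conda" ]]; then
        printf '%s\n' "${install_dir}/bin/conda"
    else
        return 1
    fi
}

# Usage: "$(resolve_conda /opt/Anaconda)" update conda
```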

Jupyter

Since Anaconda includes Jupyter Notebook, all you need to do is change its default configuration.

By default, Jupyter Notebook listens on port 8888 and authenticates users with a password or token.

Generate the configuration file with the following command:

jupyter notebook --generate-config

The generated configuration file is located at:

~/.jupyter/jupyter_notebook_config.py

See Configuration Overview for details.

The important directives are:

  • NotebookApp.allow_root
  • NotebookApp.base_url
  • NotebookApp.ip
  • NotebookApp.port
  • NotebookApp.password
  • NotebookApp.allow_password_change
  • NotebookApp.notebook_dir
  • NotebookApp.certfile
  • NotebookApp.disable_check_xsrf
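
To illustrate how a few of these directives might be set from a shell script, the sketch below appends them to the configuration file. The function name append_jupyter_settings is hypothetical, and every value is an illustrative assumption rather than the author's actual configuration. In the config file each directive is written with the c. prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a few NotebookApp settings to a Jupyter config
# file. All values below are illustrative assumptions.
append_jupyter_settings() {
    local conf_file="${1:-${HOME}/.jupyter/jupyter_notebook_config.py}"
    mkdir -p "$(dirname "${conf_file}")"
    # Quoted 'EOF' keeps the Python lines literal (no shell expansion).
    cat >> "${conf_file}" <<'EOF'
c.NotebookApp.ip = '127.0.0.1'
c.NotebookApp.port = 8888
c.NotebookApp.base_url = '/Jupyter/'
c.NotebookApp.allow_root = False
c.NotebookApp.allow_password_change = True
EOF
}

# Usage: append_jupyter_settings ~/.jupyter/jupyter_notebook_config.py
```

Because later assignments override earlier ones in the same file, appending leaves the commented defaults produced by --generate-config untouched.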

For remote access, you need to open the Jupyter port (8888 by default) in your firewall rules.

Password Generation

Jupyter Notebook provides an interactive command for generating a hashed password.

Interactive mode

The official document Alternatives to token authentication mentions:

New in version 5.0: jupyter notebook password command is added.

The generated hashed password is stored in the file:

~/.jupyter/jupyter_notebook_config.json

The demo process is as follows (raw password [email protected]_Python)

[~]$ jupyter notebook password
Enter password:
Verify password:
[NotebookPasswordApp] Wrote hashed password to /home/maxdsre/.jupyter/jupyter_notebook_config.json
[~]$ cat ~/.jupyter/jupyter_notebook_config.json
{
  "NotebookApp": {
    "password": "sha1:4b55a390f103:1e3bf1e48ca12fda8a0fad32799fc0ad82e0301c"
  }
}

Silent Mode

However, interactive mode does not lend itself to automation in a shell script. Is it possible to generate the hashed password in silent mode?

Taking Python 3 as an example, the functions Jupyter Notebook uses to generate and store the hashed password are passwd, passwd_check, set_password, and persist_config. They are defined in the file:

${installation_dir}/lib/python3.6/site-packages/notebook/auth/security.py

Extract the core code into the file /tmp/test.py, again using the raw password [email protected]_Python as an example:

# For Python 3.6
import hashlib
import random
from ipython_genutils.py3compat import cast_bytes, str_to_bytes
from notebook.auth.security import passwd_check

# length of the random salt, in hex characters
salt_len = 12


def passwd(password, algorithm='sha1'):
    """Return 'algorithm:salt:digest' for the given password."""
    h = hashlib.new(algorithm)
    salt = ('%0' + str(salt_len) + 'x') % random.getrandbits(4 * salt_len)
    h.update(cast_bytes(password, 'utf-8') + str_to_bytes(salt, 'ascii'))
    return ':'.join((algorithm, salt, h.hexdigest()))


pass_str = '[email protected]_Python'
result_str = passwd(pass_str)

# print the hashed password
print(result_str)

# verify that the hash matches the raw password
print(passwd_check(result_str, pass_str))

Execute the following command to generate the hashed password:

jupyter run /tmp/test.py

Output:

[/tmp]$ jupyter run /tmp/test.py
sha1:492e18c9b198:5c00c6ea49e8426766ffb698ced3827e579369c6
True

The verification result is True.

However, there is a problem with this approach: Anaconda supports both Python 3 and Python 2, so there are two copies of security.py that would need to be handled separately. If Jupyter Notebook changes this code, the shell script must change accordingly, and I cannot guarantee that the script will be updated in time. For this reason, custom password setting was not added to the shell script.
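
One possible way around that dependency, offered only as a sketch: the sha1:salt:digest string shown above can be rebuilt with standard command-line tools, since the digest is SHA-1 over the password followed by the salt. The helper name jupyter_hash_password is hypothetical, and the approach assumes Jupyter Notebook keeps accepting this format, which later releases may no longer use by default.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a "sha1:<salt>:<digest>" string in the same
# layout as the passwd() function above, without importing security.py.
jupyter_hash_password() {
    local password="$1"
    # 6 random bytes give 12 hex characters; an optional second argument
    # lets the caller supply a fixed salt instead.
    local salt="${2:-$(od -An -N6 -tx1 /dev/urandom | tr -d ' \n')}"
    local digest
    digest=$(printf '%s%s' "${password}" "${salt}" | sha1sum | awk '{print $1}')
    printf 'sha1:%s:%s\n' "${salt}" "${digest}"
}

# Usage: jupyter_hash_password 'my raw password'
```

The result could then be checked with passwd_check, as in the Python example, before it is written to the configuration.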

SSL

To improve the security of data transmission, SSL certificates can be configured. More details in Using SSL for encrypted communication.

Here I use the openssl command to create a self-signed SSL certificate. By default this process is also interactive, but supplying the subject with the -subj flag avoids the prompts.

Shell code sample:

jupyter_name='Jupyter'
jupyter_conf_dir="/tmp/${jupyter_name}"

self_cert_path=${self_cert_path:-"${jupyter_conf_dir}/${jupyter_name}.pem"}
# openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mycert.pem -out mycert.pem
[[ -d "${jupyter_conf_dir}" ]] || mkdir -p "${jupyter_conf_dir}"
if [[ ! -f "${self_cert_path}" ]]; then
    cert_C=${cert_C:-'CN'}    # Country Name
    cert_ST=${cert_ST:-'Shanghai'}    # State or Province Name
    cert_L=${cert_L:-'Shanghai'}    # Locality Name
    cert_O=${cert_O:-'MaxdSre'}    # Organization Name
    cert_OU=${cert_OU:-'Python'}    # Organizational Unit Name
    cert_CN=${cert_CN:-'jupyter.org'}    # Common Name
    # https://jupyter.org/community
    cert_email=${cert_email:-'[email protected]'}    # Email Address

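    # these variables are presumably set elsewhere in the full script (from the
    # public IP); in this excerpt they are unset, so C, ST and L end up empty
    cert_C="${ip_public_country_code}"
    cert_ST="${ip_public_locate%%.*}"
    cert_L="${ip_public_locate##*.}"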

    # expire date 3650 days, crypt type RSA, key length 4096
    openssl req -x509 -nodes -days 3650 -newkey rsa:4096 -keyout "${self_cert_path}" -out "${self_cert_path}" -subj "/C=${cert_C}/ST=${cert_ST}/L=${cert_L}/O=${cert_O}/OU=${cert_OU}/CN=${cert_CN}/emailAddress=${cert_email}" 2> /dev/null
    # openssl rsa -in "${self_cert_path}" -text -noout 2> /dev/null
    # note: this PEM also holds the private key, so mode 644 makes it world-readable
    [[ -f "${self_cert_path}" ]] && chmod 644 "${self_cert_path}"
fi

The testing process is as follows

# bash -x ssl.sh
+ jupyter_name=Jupyter
+ jupyter_conf_dir=/tmp/Jupyter
+ self_cert_path=/tmp/Jupyter/Jupyter.pem
+ [[ -d /tmp/Jupyter ]]
+ mkdir -p /tmp/Jupyter
+ [[ ! -f /tmp/Jupyter/Jupyter.pem ]]
+ cert_C=CN
+ cert_ST=Shanghai
+ cert_L=Shanghai
+ cert_O=MaxdSre
+ cert_OU=Python
+ cert_CN=jupyter.org
+ cert_email=[email protected]
+ cert_C=
+ cert_ST=
+ cert_L=
+ openssl req -x509 -nodes -days 3650 -newkey rsa:4096 -keyout /tmp/Jupyter/Jupyter.pem -out /tmp/Jupyter/Jupyter.pem -subj /C=/ST=/L=/O=MaxdSre/OU=Python/CN=jupyter.org/emailAddress=[email protected]
+ [[ -f /tmp/Jupyter/Jupyter.pem ]]
+ chmod 644 /tmp/Jupyter/Jupyter.pem
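
The trace shows the certificate being written but not checked. The sketch below creates a certificate the same way and then prints its subject and validity dates with openssl x509. The function names and the jupyter.test common name are assumptions for illustration, and rsa:2048 is used instead of rsa:4096 only to keep the example quick.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and defaults are illustrative.

# Create a self-signed certificate (key and cert in one PEM file), no prompts.
make_self_signed_cert() {
    local pem_path="$1" common_name="$2" days="${3:-365}"
    openssl req -x509 -nodes -days "${days}" -newkey rsa:2048 \
        -keyout "${pem_path}" -out "${pem_path}" \
        -subj "/CN=${common_name}" 2> /dev/null
}

# Print the subject and validity period; fail if the certificate has expired.
inspect_cert() {
    local pem_path="$1"
    openssl x509 -in "${pem_path}" -noout -subject -dates &&
        openssl x509 -in "${pem_path}" -noout -checkend 0 > /dev/null
}

# Usage:
# make_self_signed_cert /tmp/Jupyter/Jupyter.pem jupyter.test 3650
# inspect_cert /tmp/Jupyter/Jupyter.pem
```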

Jupyter Extension

To extend the functionality of Jupyter Notebook, consider installing the jupyter_contrib_nbextensions extension. See Unofficial Jupyter Notebook Extensions for more details.

# Install the python package
conda install -c conda-forge jupyter_contrib_nbextensions

# Install javascript and css files
jupyter contrib nbextension install --user

Testing

Command Line

[~]$ jni a
Solving environment: done

# All requested packages already installed.

Finishing executing conda update --all for Anaconda!

[~]$ jnl
[~]$ jnb
https://127.0.0.1:33525/Jupyter/?token=2709e9966fe2772e00a76ebfddfc12ac3d544eef518c530c :: /home/maxdsre/Jupyter
[~]$ jnl
https://127.0.0.1:33525/Jupyter/?token=2709e9966fe2772e00a76ebfddfc12ac3d544eef518c530c :: /home/maxdsre/Jupyter
[~]$ jne
[~]$ jnl

Web Browser

Private SSL

Reference

Change Logs

  • 2018.04.19 10:42 Wed America/Boston
    • first draft
  • 2018.07.11 11:38 Wed America/Boston
    • update version to 5.2