Installing and Configuring Anaconda and Jupyter Notebook on GNU/Linux

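Jupyter Notebook is an open-source interactive web application written in Python. The Jupyter project recommends installing Python and Jupyter Notebook through Anaconda. This post documents how to install and configure Jupyter Notebook via Anaconda, and how the whole process can be automated with a shell script.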

Introduction

We strongly recommend installing Python and Jupyter using the Anaconda Distribution, which includes Python, the Jupyter Notebook, and other commonly used packages for scientific computing and data science. – Installing Jupyter

Anaconda

Anaconda Distribution is the easiest way to do Python data science and machine learning. It includes 250+ popular data science packages and the conda package and virtual environment manager for Windows, Linux, and MacOS. Conda makes it quick and easy to install, run, and upgrade complex data science and machine learning environments like Scikit-learn, TensorFlow, and SciPy. Anaconda Distribution is the foundation of millions of data science projects as well as Amazon Web Services’ Machine Learning AMIs and Anaconda for Microsoft on Azure and Windows. – https://www.anaconda.com/what-is-anaconda/

Jupyter

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. – https://jupyter.org/

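Official documentation

Shell script

The entire installation and configuration process is implemented as a shell script hosted on GitLab. Run it with the following command: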

# curl -fsL / wget -qO-

# if need help info, specify '-h'
curl -fsL https://gitlab.com/MaxdSre/axd-ShellScript/raw/master/assets/software/Anaconda.sh | sudo bash -s --
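The comment in the command above notes that either curl or wget can fetch the script. As a minimal sketch, assuming you would rather save the script to a temporary file and read it before handing it to sudo, something like the following could be used. The function names pick_downloader and fetch_installer_script are hypothetical and are not part of the author's script.

```shell
script_url='https://gitlab.com/MaxdSre/axd-ShellScript/raw/master/assets/software/Anaconda.sh'

# Print the download command to use, preferring curl over wget.
pick_downloader() {
    if command -v curl > /dev/null 2>&1; then
        echo 'curl -fsL'
    elif command -v wget > /dev/null 2>&1; then
        echo 'wget -qO-'
    else
        return 1
    fi
}

# Save the script to a temporary file and print that file's path.
fetch_installer_script() {
    local downloader target
    downloader=$(pick_downloader) || { echo 'need curl or wget' >&2; return 1; }
    target=$(mktemp) || return 1
    # ${downloader} is unquoted on purpose so it splits into command and flags.
    ${downloader} "${script_url}" > "${target}" || return 1
    echo "${target}"
}

# Usage (not run here):
#   script_path=$(fetch_installer_script) && less "${script_path}"
#   sudo bash "${script_path}" -h
```

Reviewing the file first costs one extra step, but it avoids piping unread remote code straight into a root shell.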

To make Jupyter Notebook easier to manage, the script defines several command aliases in ~/.bashrc:

# jupyter notebook Start
alias jnl="jupyter notebook list | sed '/running servers/d'"
alias jnb="(nohup jupyter notebook &> /dev/null &); sleep 2; jnl"
alias jne="ps -ef | awk 'match(\$0,/(jupyter-noteboo|Anaconda\/bin)/)&&!match(\$0,/awk/){print \$2}' | xargs kill -9 2> /dev/null"
alias jni="curl -fsL https://gitlab.com/MaxdSre/axd-ShellScript/raw/master/assets/software/Anaconda.sh | sudo bash -s -- -U"
alias jnr="sudo /opt/Anaconda/bin/conda remove"
# jupyter notebook End
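  • jnl: list the servers currently running on the system;
  • jnb: start a new server;
  • jne: stop all running servers;
  • jni: search for, install, and update packages; jni a updates all installed packages;
  • jnr: remove packages.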
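The jne alias ends servers with kill -9, which gives them no chance to shut down cleanly. As a gentler alternative, the Notebook package also offers a jupyter notebook stop <port> subcommand, if your installed version provides it. The sketch below assumes the jupyter notebook list output format shown in the test session later in this post (one URL per line); list_server_ports and stop_all_servers are hypothetical names.

```shell
# Read `jupyter notebook list` output on stdin and print one port per line.
# Lines that do not start with an http(s) URL (such as the header) are ignored.
list_server_ports() {
    sed -n -E 's#^https?://[^/:]+:([0-9]+)/.*#\1#p'
}

# Ask every listed server to shut down.
stop_all_servers() {
    jupyter notebook list | list_server_ports | while read -r port; do
        jupyter notebook stop "${port}"
    done
}
```

The pattern only handles host names and IPv4 addresses; a server bound to an IPv6 address would need a wider expression.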

Anaconda

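The Anaconda download page is https://www.anaconda.com/download. Installers are available for both Python 3.6 and Python 2.7; download the one that matches your needs.

Version information

The latest Anaconda release at the time of writing is 5.3.

The latest version information can be extracted with the following command: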

curl -fsL https://www.anaconda.com/download/ | sed -r -n '[email protected]<\/[^>]+>@\[email protected];p' | sed -r -n '/>Release Date:/{[email protected][[:space:]]*<[^>]*>[[:space:]]*@@g;[email protected]^[^:]*:[[:space:]]*(.*)@\[email protected];p}; /Anaconda.*Linux-x86_64/{/Installer/{[email protected]*href="([^"]*)".*@\[email protected];[email protected]*Anaconda[^-]*-([^-]*).*[email protected]\[email protected];p;q}}' | sed ':a;N;$!ba;[email protected]\[email protected]|@g'

Output (earlier runs of the same command returned February 15, 2018|5.1.0 and May 30, 2018|5.2.0):

November 19, 2018|5.3.1
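The version part of that output comes from the installer file name. As a small sketch, assuming file names keep the Anaconda<N>-<version>-<platform>.sh pattern used elsewhere in this post, the version can be pulled out with a single sed expression. The function name installer_version is hypothetical.

```shell
# Print the version embedded in an Anaconda installer file name or URL.
installer_version() {
    printf '%s\n' "$1" | sed -E 's@.*Anaconda[^-]*-([^-]*)-.*@\1@'
}

installer_version 'Anaconda3-5.3.1-Linux-x86_64.sh'
```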

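Verification

Anaconda does not publish installer hashes directly on the download page. They are listed on the page Anaconda installer file hashes, and its subpage Hashes for all files gives the SHA-256 hash of every historical Anaconda release.

Taking Anaconda3-5.3.1-Linux-x86_64.sh as the example, the page Hashes for Anaconda3-5.3.1-Linux-x86_64.sh lists the following details for the installer: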

item            details
Last Modified   2018-11-19 13:38:46
size (bytes)    667976437
md5             334b43d5e8468507f123dbfe7437078f
sha256          d4c4256a8f46173b675dd6a62d12f566ed3487f932bab6bb7058f06c124bcc27

Verify the hash with either of the following commands:

# use ${HOME} rather than a quoted '~', which the shell would not expand
file_path="${HOME}/Downloads/Anaconda3-5.3.1-Linux-x86_64.sh"

# via sha256sum
sha256sum "${file_path}"

# via openssl
openssl dgst -sha256 "${file_path}"

Example session:

[~/Downloads]$ sha256sum Anaconda3-5.3.1-Linux-x86_64.sh
d4c4256a8f46173b675dd6a62d12f566ed3487f932bab6bb7058f06c124bcc27  Anaconda3-5.3.1-Linux-x86_64.sh
[~/Downloads]$ openssl dgst -sha256 Anaconda3-5.3.1-Linux-x86_64.sh
SHA256(Anaconda3-5.3.1-Linux-x86_64.sh)= d4c4256a8f46173b675dd6a62d12f566ed3487f932bab6bb7058f06c124bcc27
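Comparing the digest by eye is error-prone in a script. A minimal sketch of an automated comparison, assuming sha256sum from GNU coreutils is available; the function name verify_sha256 is hypothetical.

```shell
# Succeed only if the file's SHA-256 digest equals the expected value.
verify_sha256() {
    local file="$1" expected="$2" actual
    actual=$(sha256sum "${file}" | awk '{print $1}')
    if [ "${actual}" = "${expected}" ]; then
        echo "OK: ${file}"
    else
        echo "MISMATCH: ${file} (got ${actual})" >&2
        return 1
    fi
}

# Usage (not run here), with the hash from the table above:
#   verify_sha256 "${HOME}/Downloads/Anaconda3-5.3.1-Linux-x86_64.sh" \
#       d4c4256a8f46173b675dd6a62d12f566ed3487f932bab6bb7058f06c124bcc27
```

Because the function returns a non-zero status on mismatch, it can guard the install step directly, for example with `verify_sha256 ... && bash ...`.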

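Installation

Once the SHA-256 check passes, install Anaconda following the official documentation at https://docs.anaconda.com/anaconda/install/.

Run the following command to install: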

bash ~/Downloads/Anaconda3-5.3.1-Linux-x86_64.sh

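By default the Anaconda installer runs interactively and requires user input; see the official documentation, Installing on Linux, for details.

Anaconda also supports non-interactive installation. According to the official documentation,

passing -b performs a silent install, and -p sets a custom installation path. The options are: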

  1. -b: Batch mode with no PATH modifications to ~/.bashrc. Assumes that you agree to the license agreement. Does not edit the .bashrc or .bash_profile files.
  2. -p: Installation prefix/path.
  3. -f: Force installation even if prefix -p already exists.

Using /opt/Anaconda as the installation path, the command is:

installation_dir='/opt/Anaconda'
bash ~/Downloads/Anaconda3-5.3.1-Linux-x86_64.sh -b -f -p "${installation_dir}"
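When the install step runs from a script that may be executed more than once, it helps to skip it if Anaconda is already in place. The following is a sketch under that assumption, not the author's implementation; install_anaconda is a hypothetical name, and the check simply looks for an executable bin/conda under the prefix.

```shell
# Run the installer in batch mode unless conda already exists under the prefix.
install_anaconda() {
    local installer="$1" prefix="$2"
    if [ -x "${prefix}/bin/conda" ]; then
        echo "already installed: ${prefix}"
        return 0
    fi
    bash "${installer}" -b -p "${prefix}"
}

# Usage (not run here; writing to /opt normally requires root):
#   install_anaconda "${HOME}/Downloads/Anaconda3-5.3.1-Linux-x86_64.sh" /opt/Anaconda
```

The sketch leaves out -f, so an existing prefix directory that lacks conda makes the installer stop instead of overwriting it; add -f if overwriting is what you want.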

$PATH

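After installation, the conda command still cannot be run directly, because the executable directory ${installation_dir}/bin/ is not in the $PATH environment variable. It can be added to $PATH through the file ${installation_dir}/bin/activate.

Anaconda's suggested approach is to add the following line to ~/.bashrc: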

export PATH="${installation_dir}/bin:$PATH"

I prefer to put this setting in a file under /etc/profile.d/ instead, so that other users on the system can use conda as well.
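A minimal sketch of such a profile snippet, assuming login shells source /etc/profile.d/*.sh (common on GNU/Linux distributions, but worth confirming on yours). The function name write_conda_profile and the target file name anaconda.sh are hypothetical. The generated snippet guards against adding the same directory twice.

```shell
# Write a profile snippet that prepends <prefix>/bin to PATH exactly once.
write_conda_profile() {
    local prefix="$1" target="$2"
    cat > "${target}" <<EOF
# Prepend Anaconda's bin directory to PATH unless it is already there.
case ":\${PATH}:" in
    *":${prefix}/bin:"*) ;;
    *) export PATH="${prefix}/bin:\${PATH}" ;;
esac
EOF
}

# Usage (not run here; needs root to write under /etc):
#   write_conda_profile /opt/Anaconda /etc/profile.d/anaconda.sh
```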

Update conda with either of the following commands:

# 1 - in $PATH
conda update conda
# 2 - absolute path
/opt/Anaconda/bin/conda update conda

Jupyter

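Anaconda already bundles Jupyter Notebook; it only needs a little configuration.

Jupyter Notebook listens on port 8888 by default and authenticates with either a password or a token.

Generate the configuration file with: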

jupyter notebook --generate-config

The configuration file is written to:

~/.jupyter/jupyter_notebook_config.py
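In an automated run, it is worth generating the file only when it is missing, since rerunning --generate-config over an existing file may ask for confirmation before overwriting it. A sketch under that assumption; ensure_jupyter_config is a hypothetical name, and the function honours the JUPYTER_CONFIG_DIR environment variable when it is set.

```shell
# Generate the default config only when the file does not exist yet.
ensure_jupyter_config() {
    local config_dir="${JUPYTER_CONFIG_DIR:-${HOME}/.jupyter}"
    local config="${config_dir}/jupyter_notebook_config.py"
    if [ -f "${config}" ]; then
        echo "keeping existing ${config}"
        return 0
    fi
    jupyter notebook --generate-config
}
```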

Configuration documentation: Configuration Overview

The main options to set are:

  • NotebookApp.allow_root
  • NotebookApp.base_url
  • NotebookApp.ip
  • NotebookApp.port
  • NotebookApp.password
  • NotebookApp.allow_password_change
  • NotebookApp.notebook_dir
  • NotebookApp.certfile
  • NotebookApp.disable_check_xsrf
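As an illustration of how the options listed above end up in jupyter_notebook_config.py, the sketch below appends a block of settings from the shell. The values are assumptions chosen to match the sessions in this post (base URL /Jupyter/, notebook directory /home/maxdsre/Jupyter, certificate at /tmp/Jupyter/Jupyter.pem); write_notebook_settings is a hypothetical name. The password is left out because jupyter notebook password stores it in a separate JSON file.

```shell
# Append example settings to a Jupyter config file (values are illustrative).
write_notebook_settings() {
    local config="$1"
    cat >> "${config}" <<'EOF'
c.NotebookApp.ip = '127.0.0.1'  # use '0.0.0.0' to accept remote connections
c.NotebookApp.port = 8888
c.NotebookApp.base_url = '/Jupyter/'
c.NotebookApp.notebook_dir = '/home/maxdsre/Jupyter'
c.NotebookApp.certfile = '/tmp/Jupyter/Jupyter.pem'
c.NotebookApp.allow_root = False
c.NotebookApp.allow_password_change = True
c.NotebookApp.disable_check_xsrf = False
EOF
}

# Usage (not run here):
#   write_notebook_settings "${HOME}/.jupyter/jupyter_notebook_config.py"
```

The directory named in notebook_dir and the file named in certfile must exist before the server starts.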

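For remote access, the Jupyter port (8888 by default) must also be opened in the firewall.

Generating a password

Jupyter Notebook generates the hashed password interactively.

Interactive mode

The official documentation, Alternatives to token authentication, notes:

New in version 5.0: jupyter notebook password command is added.

The hashed password is stored in: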

~/.jupyter/jupyter_notebook_config.json

Example session, using a sample password:

[~]$ jupyter notebook password
Enter password:
Verify password:
[NotebookPasswordApp] Wrote hashed password to /home/maxdsre/.jupyter/jupyter_notebook_config.json
[~]$ cat ~/.jupyter/jupyter_notebook_config.json
{
  "NotebookApp": {
    "password": "sha1:4b55a390f103:1e3bf1e48ca12fda8a0fad32799fc0ad82e0301c"
  }
}

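Silent mode

Interactive prompts do not suit automation in a shell script, however. Can the password be generated non-interactively?

Taking Python 3 as the example, Jupyter Notebook generates passwords using the file below, which defines the relevant functions such as passwd, passwd_check, set_password, and persist_config: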

${installation_dir}/lib/python3.6/site-packages/notebook/auth/security.py

Extract part of that code into the file /tmp/test.py, using the same sample password as before.

# For Python 3.6
import hashlib
import random
from ipython_genutils.py3compat import cast_bytes, str_to_bytes
from notebook.auth.security import passwd_check

salt_len = 12

def passwd(password,algorithm='sha1'):
    h = hashlib.new(algorithm)
    salt = ('%0' + str(salt_len) + 'x') % random.getrandbits(4 * salt_len)
    h.update(cast_bytes(password, 'utf-8') + str_to_bytes(salt, 'ascii'))
    return ':'.join((algorithm, salt, h.hexdigest()))

pass_str='[email protected]_Python'
result_str=passwd(pass_str)

# print hashed passwd
print(result_str)

# verify that the hash matches the password
print(passwd_check(result_str,pass_str))

Run it with:

jupyter run /tmp/test.py

Session:

[/tmp]$ jupyter run /tmp/test.py
sha1:492e18c9b198:5c00c6ea49e8426766ffb698ced3827e579369c6
True

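The check returns True.

This approach has a drawback, though. Anaconda supports both Python 3 and Python 2, so there are two copies of security.py that would need to be handled separately. And if Jupyter Notebook changes that code, the shell script has to change with it, which I cannot promise to do promptly. For these reasons, the custom password feature was left out of the shell script.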
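One way to avoid depending on security.py at all is to compute the hash with standard command-line tools. The following is a minimal sketch, assuming the sha1:<salt>:<digest> format shown in the sessions above, where the digest is SHA-1 over the password followed by the salt, as in the passwd function above. The function names are hypothetical. Newer Notebook releases may default to a different hashing algorithm, so check which format your version expects before relying on this.

```shell
# Hash a password with a given salt in the sha1:<salt>:<digest> form.
jupyter_sha1_hash() {
    local password="$1" salt="$2" digest
    digest=$(printf '%s%s' "${password}" "${salt}" | sha1sum | awk '{print $1}')
    printf 'sha1:%s:%s\n' "${salt}" "${digest}"
}

# Produce 12 random hex characters, matching salt_len = 12 in the Python code.
random_salt() {
    od -An -N6 -tx1 /dev/urandom | tr -d ' \n'
}

# Recompute the digest using the salt embedded in a stored hash and compare.
jupyter_sha1_check() {
    local stored="$1" password="$2" salt
    salt=$(printf '%s' "${stored}" | cut -d: -f2)
    [ "$(jupyter_sha1_hash "${password}" "${salt}")" = "${stored}" ]
}

# Usage (not run here):
#   hashed=$(jupyter_sha1_hash 'my-password' "$(random_salt)")
```

The resulting string could then be placed in the password field of jupyter_notebook_config.json.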

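SSL certificate

To secure data in transit, an SSL certificate can be configured; see the official documentation, Using SSL for encrypted communication.

A self-signed SSL certificate can be created with openssl. The process is interactive by default, but the -subj option makes it non-interactive.

Example shell code: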

jupyter_name='Jupyter'
jupyter_conf_dir="/tmp/${jupyter_name}"

self_cert_path=${self_cert_path:-"${jupyter_conf_dir}/${jupyter_name}.pem"}
# openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mycert.pem -out mycert.pem
[[ -d "${jupyter_conf_dir}" ]] || mkdir -p "${jupyter_conf_dir}"
if [[ ! -f "${self_cert_path}" ]]; then
    cert_C=${cert_C:-'CN'}    # Country Name
    cert_ST=${cert_ST:-'Shanghai'}    # State or Province Name
    cert_L=${cert_L:-'Shanghai'}    # Locality Name
    cert_O=${cert_O:-'MaxdSre'}    # Organization Name
    cert_OU=${cert_OU:-'Python'}    # Organizational Unit Name
    cert_CN=${cert_CN:-'jupyter.org'}    # Common Name
    # https://jupyter.org/community
    cert_email=${cert_email:-'[email protected]'}    # Email Address

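    # ip_public_country_code and ip_public_locate are not defined in this excerpt;
    # when they are unset, the three assignments below leave these fields empty.
    cert_C="${ip_public_country_code}"
    cert_ST="${ip_public_locate%%.*}"
    cert_L="${ip_public_locate##*.}"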

    # expire date 3650 days, crypt type RSA, key length 4096
    openssl req -x509 -nodes -days 3650 -newkey rsa:4096 -keyout "${self_cert_path}" -out "${self_cert_path}" -subj "/C=${cert_C}/ST=${cert_ST}/L=${cert_L}/O=${cert_O}/OU=${cert_OU}/CN=${cert_CN}/emailAddress=${cert_email}" 2> /dev/null
    # openssl rsa -in "${self_cert_path}" -text -noout 2> /dev/null
    [[ -f "${self_cert_path}" ]] && chmod 644 "${self_cert_path}"
fi

Execution trace:

# bash -x ssl.sh
+ jupyter_name=Jupyter
+ jupyter_conf_dir=/tmp/Jupyter
+ self_cert_path=/tmp/Jupyter/Jupyter.pem
+ [[ -d /tmp/Jupyter ]]
+ mkdir -p /tmp/Jupyter
+ [[ ! -f /tmp/Jupyter/Jupyter.pem ]]
+ cert_C=CN
+ cert_ST=Shanghai
+ cert_L=Shanghai
+ cert_O=MaxdSre
+ cert_OU=Python
+ cert_CN=jupyter.org
+ cert_email=[email protected]
+ cert_C=
+ cert_ST=
+ cert_L=
+ openssl req -x509 -nodes -days 3650 -newkey rsa:4096 -keyout /tmp/Jupyter/Jupyter.pem -out /tmp/Jupyter/Jupyter.pem -subj /C=/ST=/L=/O=MaxdSre/OU=Python/CN=jupyter.org/emailAddress=[email protected]
+ [[ -f /tmp/Jupyter/Jupyter.pem ]]
+ chmod 644 /tmp/Jupyter/Jupyter.pem
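The trace shows the subject being passed as /C=/ST=/L=/..., because the location variables were empty in that run. As a sketch of one way to avoid empty components, the subject string can be assembled from KEY=value pairs while skipping any pair without a value. build_subject is a hypothetical helper and is not part of the author's script.

```shell
# Join KEY=value pairs into an openssl -subj string, skipping empty values.
build_subject() {
    local subject='' pair
    for pair in "$@"; do
        # "${pair#*=}" is the text after the first '='.
        if [ -n "${pair#*=}" ]; then
            subject="${subject}/${pair}"
        fi
    done
    printf '%s\n' "${subject}"
}

# Usage (not run here):
#   openssl req -x509 -nodes -days 3650 -newkey rsa:4096 \
#       -keyout "${self_cert_path}" -out "${self_cert_path}" \
#       -subj "$(build_subject "C=${cert_C}" "ST=${cert_ST}" "L=${cert_L}" "O=${cert_O}" "CN=${cert_CN}")"
```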

Jupyter extensions

To extend Jupyter Notebook, you can optionally install the jupyter_contrib_nbextensions package; see its documentation, Unofficial Jupyter Notebook Extensions.

# Install the python package
conda install -c conda-forge jupyter_contrib_nbextensions

# Install javascript and css files
jupyter contrib nbextension install --user

Testing

Command line

[~]$ jni a
Solving environment: done

# All requested packages already installed.

Finishing executing conda update --all for Anaconda!

[~]$ jnl
[~]$ jnb
https://127.0.0.1:33525/Jupyter/?token=2709e9966fe2772e00a76ebfddfc12ac3d544eef518c530c :: /home/maxdsre/Jupyter
[~]$ jnl
https://127.0.0.1:33525/Jupyter/?token=2709e9966fe2772e00a76ebfddfc12ac3d544eef518c530c :: /home/maxdsre/Jupyter
[~]$ jne
[~]$ jnl
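In the session above, jnb prints the server URL after the fixed two-second pause built into the alias (sleep 2). On a slow machine the server may not be listed yet by then. A sketch of a polling alternative, assuming it is acceptable to retry once per second up to a timeout; wait_until and server_listed are hypothetical names.

```shell
# Retry a command once per second until it succeeds or the timeout (seconds) passes.
wait_until() {
    local timeout="$1" waited=0
    shift
    until "$@" > /dev/null 2>&1; do
        if [ "${waited}" -ge "${timeout}" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Succeed when `jupyter notebook list` reports at least one server URL.
server_listed() {
    jupyter notebook list 2> /dev/null | grep -q '^http'
}

# Usage (not run here):
#   (nohup jupyter notebook &> /dev/null &); wait_until 10 server_listed && jnl
```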

Web browser

Self-signed SSL

Change log

  • 2018.04.19 10:42 Wed America/Boston
    • First draft completed
  • 2018.07.11 11:38 Wed America/Boston
    • Updated to version 5.2
  • 2018.11.29 10:45 Thu America/Boston
    • Updated to version 5.3.1