Setting up Emacs for Data Analysis in Python

[Image: xkcd 1987, “Python Environment”]

The untimely demise of my old Asus laptop gave me an occasion to upgrade, and I decided to go with the 2020 Dell XPS 13. I might have written a blog post about installing Fedora 33, but there’s really nothing to report. Following the instructions on the Fedora web site was all that was required, and everything works flawlessly. Battery life is not what it might be if Windows were installed, but the trade-off is more than acceptable. Perhaps I will write a follow-up on how the machine works day to day. Now that I’ve tinkered with GNOME enough to be happy with it, the time has come to turn to a more formidable installation task: my Python development environment. That will be the topic of this post. I hope this is of use to someone; at the very least, it will serve as a reference for me the next time I have to set up a new machine.

The key elements of my Python environment, which I use mainly for data analysis with Pandas and for creating visualizations with Seaborn and Matplotlib, are as follows.

  • Emacs, with Elpy for entering and running code
  • Jupyter IPython server for executing code interactively within Emacs
  • Jupyter notebook, for pasting in code developed in Emacs, for sharing with others
  • pipenv and pyenv for managing Python environments
  • pyvenv in Emacs, for an interface to virtual environments created with pyenv

I’ll walk through the steps I used to get all this up and running. I am heavily indebted to Daniel van Flymen (https://hackernoon.com/reaching-python-development-nirvana-bb5692adf30c), Gioele Barabucci (https://gioele.io/pyenv-pipenv), and Alfredo Motta (https://www.alfredo.motta.name/create-isolated-jupyter-ipython-kernels-with-pyenv-and-virtualenv/). Gioele makes a compelling case for using pyenv and pipenv to manage Python environments. Why not just use Anaconda? Well, why not, indeed. Perhaps this too will be addressed in a subsequent post, if I can find the time between writing code and teaching.

Python environment setup: pyenv and pipenv

The first thing I did was to install the build dependencies listed at https://github.com/pyenv/pyenv/wiki/Common-build-problems, so that the automatic installer and the Python builds that follow would work.

[adam@localhost]~% sudo dnf install zlib-devel bzip2 bzip2-devel readline-devel sqlite sqlite-devel openssl-devel xz xz-devel libffi-devel findutils

Then, as instructed at https://github.com/pyenv/pyenv-installer:

[adam@localhost]~% curl https://pyenv.run | bash

Then, to add the PYENV_ROOT shell variable:

[adam@localhost]~% echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc
[adam@localhost]~% echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc

Next:

[adam@localhost]~% echo -e 'if command -v pyenv 1>/dev/null 2>&1; then eval "$(pyenv init -)"\nfi' >> ~/.zshrc

(The latter two sets of instructions come from https://github.com/pyenv/pyenv#installation, steps 2 and 3 under “Basic GitHub Checkout.”)
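Once a new shell has been opened, it can be worth confirming that the lines added to ~/.zshrc took effect. The following is a minimal sketch, not part of the pyenv instructions: the function name pyenv_path_entries is made up for illustration, and it assumes that pyenv’s directories (its bin directory and, after pyenv init, its shims directory) sit under PYENV_ROOT, falling back to ~/.pyenv when that variable is unset.

```python
import os
from pathlib import Path


def pyenv_path_entries(environ):
    """Return the PATH entries that live under PYENV_ROOT.

    Hypothetical helper for illustration. Assumes pyenv's directories sit
    under PYENV_ROOT, or under ~/.pyenv when PYENV_ROOT is not set.
    """
    if "PYENV_ROOT" in environ:
        root = Path(environ["PYENV_ROOT"])
    else:
        root = Path.home() / ".pyenv"
    entries = [Path(p) for p in environ.get("PATH", "").split(os.pathsep) if p]
    # Keep entries that are the root itself or anywhere below it.
    return [str(p) for p in entries if p == root or root in p.parents]


if __name__ == "__main__":
    found = pyenv_path_entries(os.environ)
    if found:
        print("pyenv directories on PATH:")
        for entry in found:
            print("  " + entry)
    else:
        print("No pyenv directories on PATH; open a new shell or re-check ~/.zshrc")
```

If nothing is reported, the most likely cause is that the current shell was started before ~/.zshrc was edited.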

Now it’s time to install Python.

[adam@localhost]~% pyenv install 3.8.6
Downloading Python-3.8.6.tar.xz...
-> https://www.python.org/ftp/python/3.8.6/Python-3.8.6.tar.xz
Installing Python-3.8.6...
Installed Python-3.8.6 to /home/adam/.pyenv/versions/3.8.6

[adam@localhost]~%

Of course, one can install other versions as well. This one lands in ~/.pyenv/versions/3.8.6. Everything lives in the home directory, so it is isolated from system installations of Python.

Next, install pipenv with pip.

[adam@localhost]~% pip install pipenv

To create a virtual environment, such as one might use for a project, pyenv-virtualenv is used; it’s installed along with pyenv. As indicated at https://github.com/pyenv/pyenv-virtualenv, the command

[adam@localhost]% echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.zshrc

sets up automatic activation of virtual environments.

A virtual environment for Python 3.8.6 is created:

[adam@localhost]~/.pyenv/versions/3.8.6% pyenv virtualenv 3.8.6 adam-virtual-env-3.8.6

The next step is to activate the environment.

[adam@localhost]~/.pyenv/versions/3.8.6% pyenv activate adam-virtual-env-3.8.6
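To confirm that the activated environment is really the one in use, a short script can be run with the environment’s python. This is a minimal sketch; the function name interpreter_report is made up for illustration, and the virtual-environment check relies on the standard convention that sys.prefix and sys.base_prefix differ inside a venv or virtualenv.

```python
import sys


def interpreter_report():
    """Describe the interpreter that is running this code (illustrative helper)."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Inside a venv/virtualenv, sys.prefix points at the environment,
        # while sys.base_prefix points at the Python it was created from.
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

With the environment active, the executable should sit somewhere under ~/.pyenv and in_virtualenv should be True; the same check can be repeated later from a notebook or from the Emacs Python shell.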

Gioele gives a nice account of how to use pipenv in a virtual environment (https://gioele.io/pyenv-pipenv; see above).

Emacs Python coding environment

I like to write code in Elpy. Besides all of the familiar Emacs editing tools I know and love, it has built-in syntax checking and code formatting, and code snippets can be run from within Emacs. Because so many people share and run their code in Jupyter notebooks, I usually have a notebook open, which I build at the same time as I am developing the code in Emacs. That way I can be sure that the notebooks will generate the expected results. The notebooks are run from within a virtual environment using the python version and packages I want. To further guarantee that the notebook and my code in Emacs are using the same interactive Python kernel, I set up a kernel for each virtual environment that can be used by both Emacs and a notebook.

The first step is to install jupyter in the virtual environment with pip, which pulls in everything needed. Then, as explained by Alfredo at https://www.alfredo.motta.name/create-isolated-jupyter-ipython-kernels-with-pyenv-and-virtualenv/, identify the data directories used by jupyter.

(adam-virtual-env-3.8.6) [adam@localhost]~% jupyter --paths
config:
    /home/adam/.jupyter
    /home/adam/.pyenv/versions/3.8.6/envs/adam-virtual-env-3.8.6/etc/jupyter
    /usr/local/etc/jupyter
    /etc/jupyter
data:
    /home/adam/.local/share/jupyter
    /home/adam/.pyenv/versions/3.8.6/envs/adam-virtual-env-3.8.6/share/jupyter
    /usr/local/share/jupyter
    /usr/share/jupyter
runtime:
    /home/adam/.local/share/jupyter/runtime
(adam-virtual-env-3.8.6) [adam@localhost]~%

The (adam-virtual-env-3.8.6) prefix in the prompt indicates that I’m using the virtual environment I created as described above.

Now we make a kernels directory in the first of the paths listed for jupyter’s data.

(adam-virtual-env-3.8.6) [adam@localhost]~% mkdir /home/adam/.local/share/jupyter/kernels
(adam-virtual-env-3.8.6) [adam@localhost]~% mkdir /home/adam/.local/share/jupyter/kernels/adam-virtual-env-3.8.6

Next, we need to find out where the python executable is in our virtual environment.

(adam-virtual-env-3.8.6) [adam@localhost]~% pyenv which python
/home/adam/.pyenv/versions/adam-virtual-env-3.8.6/bin/python

Now we add a kernel.json file to our new directory (under the first of the data directories discovered above), putting in the path to the python executable and the name of the virtual environment.

{
    "argv": [
        "/home/adam/.pyenv/versions/adam-virtual-env-3.8.6/bin/python",
        "-m",
        "ipykernel_launcher",
        "-f",
        "{connection_file}"
    ],
    "display_name": "adam-virtual-env-3.8.6",
    "language": "python"
}
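For more than one environment, writing this file by hand gets repetitive. The following is a minimal sketch that produces the same structure as the kernel.json above; the function names kernel_spec and write_kernel_spec are made up for illustration, and the caller is assumed to supply the data directory, the environment name, and the path reported by pyenv which python.

```python
import json
from pathlib import Path


def kernel_spec(python_path, display_name):
    """Build the same structure as the hand-written kernel.json (illustrative)."""
    return {
        "argv": [
            str(python_path),
            "-m",
            "ipykernel_launcher",
            "-f",
            "{connection_file}",
        ],
        "display_name": display_name,
        "language": "python",
    }


def write_kernel_spec(data_dir, name, python_path):
    """Write <data_dir>/kernels/<name>/kernel.json and return its path."""
    kernel_dir = Path(data_dir) / "kernels" / name
    # Creates the kernels directory and the per-environment directory if missing.
    kernel_dir.mkdir(parents=True, exist_ok=True)
    target = kernel_dir / "kernel.json"
    target.write_text(json.dumps(kernel_spec(python_path, name), indent=4) + "\n")
    return target


# Example call with the values used in this post (overwrites an existing file):
# write_kernel_spec(
#     "/home/adam/.local/share/jupyter",
#     "adam-virtual-env-3.8.6",
#     "/home/adam/.pyenv/versions/adam-virtual-env-3.8.6/bin/python",
# )
```

The example call is left commented out because it overwrites any kernel.json already present for that environment.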

This kernel can be selected in a Jupyter notebook, and it’s the same one that will be used in Elpy configured as follows, based on the Elpy documentation at https://elpy.readthedocs.io/en/latest/ide.html#interactive-python.

(setq python-shell-interpreter "jupyter"
      python-shell-interpreter-args "console --simple-prompt"
      python-shell-prompt-detect-failure-warning nil
      python-shell-completion-native-enable nil)
;; (add-to-list 'python-shell-completion-native-disabled-interpreters
;; "jupyter")

The difference between this and what the docs recommend is that, in order to avoid generating an error having to do with readline and completion, the final two lines must be commented out, and python-shell-completion-native-enable must be set to nil. This bug is discussed at https://github.com/jorgenschaefer/elpy/issues/887. It appears that jupyter itself provides useful completion, and code entry will occur in the Elpy buffer anyhow, so there doesn’t seem to be any loss of function here.

Finally, to use the virtual environment in Elpy, I use

M-x pyvenv-activate [RET] ~/.pyenv/versions/adam-virtual-env-3.8.6/

When a Python file is opened in a buffer with Elpy enabled, the kernel specified for the virtual environment will be used for IPython.