Activating a Conda environment in your Dockerfile


The Conda packaging tool implements environments, which let different applications have different libraries installed.
So when you’re building a Docker image for a Conda-based application, you’ll need to activate a Conda environment.

Unfortunately, activating Conda environments is a bit complex, and interacts badly with the way Dockerfiles work.

So how do you activate a Conda environment in a Dockerfile?

For educational purposes I’m going to start by explaining the problem and showing some solutions that won’t work, but if you want you can skip straight to the working solution.

The problem with conda activate

Conda environments provide a form of isolation: each environment has its own set of C libraries, Python libraries, binaries, and so on.
Conda itself is installed in a base environment, so to use a Conda-based application you need to create and then activate a new, application-specific environment.

Specifically, to activate a Conda environment, you usually run conda activate.
So let’s try that as our first attempt, and see how it fails.

We’ll start with an environment.yml file defining the Conda environment:

name: myenv
channels:
  - conda-forge
dependencies:
  - python=3.8
  - flask

And a small Python program, run.py:

import flask

print("It worked!")

A first attempt at a Dockerfile might look as follows:

FROM continuumio/miniconda3

WORKDIR /app

# Create the environment:
COPY environment.yml .
RUN conda env create -f environment.yml

# Activate the environment, and make sure it's activated:
RUN conda activate myenv
RUN echo "Make sure flask is installed:"
RUN python -c "import flask"

# The code to run when container is started:
COPY run.py .
ENTRYPOINT ["python", "run.py"]

If we build the resulting Docker image, here’s what happens:

$ docker build .
...
Step 5/9 : RUN conda activate myenv
 ---> Running in aa2d7da176d0

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.

Why emulating activation won’t work

Can you avoid using conda activate, and just set a few environment variables?
Probably not.

Unlike the activate script for the Python virtualenv tool, which just sets an environment variable or two, Conda’s activation can also apply environment variables defined by the packages installed in the environment.
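
To make that a bit more concrete: as far as I understand Conda’s layout, packages can ship hook scripts under etc/conda/activate.d inside the environment, and those get run on activation. Here is a minimal Python sketch, assuming that conventional layout, that lists the hooks for an environment. The function name and the example path are hypothetical, chosen only for illustration.

```python
from pathlib import Path
from typing import List


def activation_scripts(env_prefix: str) -> List[str]:
    """Return the names of activation hook scripts under an environment prefix.

    Assumes the conventional layout <prefix>/etc/conda/activate.d/.
    """
    hook_dir = Path(env_prefix) / "etc" / "conda" / "activate.d"
    if not hook_dir.is_dir():
        # No hook directory means no package registered activation scripts.
        return []
    return sorted(p.name for p in hook_dir.iterdir() if p.is_file())


# Hypothetical path: where the continuumio/miniconda3 image would
# typically place an environment named "myenv".
print(activation_scripts("/opt/conda/envs/myenv"))
```

Any script that shows up in that list is something a hand-rolled set of ENV lines would silently skip.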

That means you can’t just emulate activation by hand; you need to use Conda’s own activation infrastructure.
So how are we going to do that?

A failed solution, part #1: conda init

Can we make conda activate work by doing the requested conda init?
It’s a start, but it won’t suffice.

conda init bash will install some startup commands for bash that will enable conda activate to work, but that setup code only runs if you have a login bash shell.
When you do:

RUN conda activate env

What’s actually happening is that Docker is doing /bin/sh -c "conda activate env".
However, you can override the default shell with the SHELL instruction.
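
As a rough mental model, Docker takes the shell-form command string and appends it as a single argument to whatever SHELL is currently set to. The Python sketch below only approximates that composition; run_argv is a hypothetical helper for illustration, not part of Docker.

```python
from typing import List

# Docker's default shell for shell-form instructions on Linux images.
DEFAULT_SHELL = ["/bin/sh", "-c"]


def run_argv(command: str, shell: List[str] = DEFAULT_SHELL) -> List[str]:
    """Approximate the argv executed for a shell-form `RUN <command>`."""
    # The whole command string becomes one trailing argument to the shell.
    return list(shell) + [command]


print(run_argv("conda activate env"))
# ['/bin/sh', '-c', 'conda activate env']

print(run_argv("conda activate env", shell=["/bin/bash", "--login", "-c"]))
# ['/bin/bash', '--login', '-c', 'conda activate env']
```

Seen this way, a SHELL override simply swaps out the prefix that every later RUN command is handed to.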

Combining a SHELL override with conda init bash gives us the following Dockerfile:

FROM continuumio/miniconda3

WORKDIR /app

# Make RUN commands use `bash --login`:
SHELL ["/bin/bash", "--login", "-c"]

# Create the environment:
COPY environment.yml .
RUN conda env create -f environment.yml

# Initialize conda in bash config files:
RUN conda init bash

# Activate the environment, and make sure it's activated:
RUN conda activate myenv
RUN echo "Make sure flask is installed:"
RUN python -c "import flask"

# The code to run when container is started:
COPY run.py .
ENTRYPOINT ["python", "run.py"]

Will this work?
No, it won’t:

$ docker build .
...
Step 9/11 : RUN python -c "import flask"
 ---> Running in adcecb020043
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'flask'

The problem is that each RUN in a Dockerfile is a separate run of bash.
So when you do:

RUN conda activate myenv
RUN echo "Make sure flask is installed:"
RUN python -c "import flask"

That activates the environment only for the first RUN; each later RUN starts a new shell session in which no activation has happened.
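
You can see the same effect outside Docker. The sketch below is only an analogy: it runs each command in a fresh /bin/sh, much as each RUN does, and uses a made-up variable, DEMO_ENV, to stand in for whatever conda activate would change.

```python
import subprocess


def run_step(command: str) -> str:
    """Run one command in a fresh /bin/sh, roughly like a single Dockerfile RUN."""
    result = subprocess.run(
        ["/bin/sh", "-c", command],
        capture_output=True,
        text=True,
        check=True,
        # Every step starts from the same clean environment.
        env={"PATH": "/usr/bin:/bin"},
    )
    return result.stdout.strip()


# Step 1 changes its own shell's environment, then that shell exits.
run_step("export DEMO_ENV=myenv")

# Step 2 is a new shell, so the change from step 1 is gone.
separate = run_step('echo "${DEMO_ENV:-not set}"')

# Doing both in one shell keeps the change visible.
combined = run_step('export DEMO_ENV=myenv; echo "${DEMO_ENV:-not set}"')

print(separate)  # not set
print(combined)  # myenv
```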

A failed solution, part #2: Activate automatically

We want the environment to be activated for every RUN command, so we add conda activate to the ~/.bashrc of the current user:

FROM continuumio/miniconda3

WORKDIR /app

# Make RUN commands use `bash --login`:
SHELL ["/bin/bash", "--login", "-c"]

# Create the environment:
COPY environment.yml .
RUN conda env create -f environment.yml

# Initialize conda in bash config files:
RUN conda init bash

# Activate the environment, and make sure it's activated:
RUN echo "conda activate myenv" >> ~/.bashrc
RUN echo "Make sure flask is installed:"
RUN python -c "import flask"

# The code to run when container is started:
COPY run.py .
ENTRYPOINT ["python", "run.py"]

And now the image builds!

$ docker build -t condatest .
...
Successfully tagged condatest:latest

We’re not done yet, though.
If we run the image:

$ docker run condatest
Traceback (most recent call last):
  File "run.py", line 1, in <module>
    import flask
ModuleNotFoundError: No module named 'flask'

The problem is that the syntax we used for ENTRYPOINT doesn’t actually start a shell session.
Now, instead of doing ENTRYPOINT ["python", "run.py"], you can actually have ENTRYPOINT use a shell with this alternative syntax:

ENTRYPOINT python run.py

The problem with this syntax is that it breaks container shutdown: signals sent to the container typically reach the wrapping shell rather than your program. So you probably don’t want to use it.

A working solution

Instead of using conda activate, there’s another way to run a command inside an environment.
conda run -n myenv yourcommand will run yourcommand inside the environment.

So that suggests the following Dockerfile:

FROM continuumio/miniconda3

WORKDIR /app

# Create the environment:
COPY environment.yml .
RUN conda env create -f environment.yml

# Make RUN commands use the new environment:
SHELL ["conda", "run", "-n", "myenv", "/bin/bash", "-c"]

# Make sure the environment is activated:
RUN echo "Make sure flask is installed:"
RUN python -c "import flask"

# The code to run when container is started:
COPY run.py .
ENTRYPOINT ["conda", "run", "-n", "myenv", "python", "run.py"]

And indeed:

$ docker build -t condatest .
...
Successfully tagged condatest:latest
$ docker run condatest
It worked!
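
If you’d like run.py to report which environment it is actually running in, one option is to look at sys.prefix. This is a minimal sketch that assumes the usual layout where named environments live in an envs directory (for example /opt/conda/envs/myenv). conda_env_name is a hypothetical helper, not a Conda API.

```python
import sys
from pathlib import Path
from typing import Optional


def conda_env_name(prefix: str) -> Optional[str]:
    """Guess the Conda environment name from an interpreter prefix.

    Assumes named environments live at <conda root>/envs/<name>;
    returns None for anything else, such as the base environment.
    """
    path = Path(prefix)
    if path.parent.name == "envs":
        return path.name
    return None


# sys.prefix is the installation prefix of the running interpreter.
print(conda_env_name(sys.prefix))
```

Under that assumption, running it through the ENTRYPOINT above should report myenv, while the base interpreter should report None.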


