How do I download code via git inside a Docker container?
I have a sample Dockerfile that I want to use to build containers for my data science project (sample below). So far this works, but the code I want to run lives in GitLab, and I would like to download that code and run it in this container. I am looking for examples of how to do this and have a few questions:
1. On my local machine, the SSH keys are set up with my GitLab account, so for testing purposes I should be able to pull down the code. But after I download it, do I just dump it into the working directory? And if there are other artifacts or data files that need to be downloaded as well, where should I keep them?
2. Also, when testing locally, my code reads a CSV file; where should I put that file? I know Docker has temporary storage that can be set up, so I assume the way to go is to mount a directory into the container and put the files there, or something similar.
3. The GitLab clone will work on my local machine, but if I want to run this in AWS or some other cloud environment, that instance won't have SSH keys set up with my GitLab account. How is this usually handled?
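For question 2, a common pattern is to bind-mount a host directory into the container at run time instead of baking the CSV into the image. A minimal sketch (the image name, script, and paths here are placeholders, not from the Dockerfile below):

```shell
# Mount ./data from the host as /var/www/data inside the container (read-only);
# the Python code then reads its input from /var/www/data/.
docker run --rm \
    -v "$(pwd)/data:/var/www/data:ro" \
    my-ds-image \
    python3 train.py --input /var/www/data/input.csv
```

For larger or shared datasets, a named Docker volume (`docker volume create`) serves the same purpose and survives container removal.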
```
FROM python:3.6-alpine
WORKDIR /var/www/
# TODO: pull code from git
# SOFTWARE PACKAGES
# * musl: standard C library
# * libc6-compat: compatibility libraries for glibc
# * linux-headers: commonly needed, and an unusual package name from Alpine.
# * build-base: used so we include the basic development packages (gcc)
# * bash: -- /bin/bash
# * git: to ease up clones of repos
# * ca-certificates: for SSL verification during Pip and easy_install
# * tcl: scripting language
# * libssl1.0: SSL shared libraries
ENV SOFTWARE_PACKAGES="\
dumb-init \
musl \
libc6-compat \
linux-headers \
build-base \
bash \
git \
ca-certificates \
tcl \
libssl1.0 \
"
# PYTHON DATA SCIENCE PACKAGES
ENV PYTHON_PACKAGES="\
numpy \
matplotlib \
pandas \
"
# python:3.6-alpine already ships Python 3 and pip, so no extra python3
# package or ensurepip call is needed. With --no-cache there is also
# nothing to clean up under /var/cache/apk.
RUN apk add --no-cache $SOFTWARE_PACKAGES \
    && ln -s /usr/include/locale.h /usr/include/xlocale.h \
    && pip install --no-cache-dir $PYTHON_PACKAGES
CMD ["python3"]
```
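For questions 1 and 3, one common approach is a GitLab deploy token passed in as a build argument, so the same build works on a laptop and on a cloud instance with no SSH keys configured. A sketch only; the group/project path and argument names are placeholders:

```dockerfile
# Hypothetical fragment: clone a private GitLab repo over HTTPS at build time.
ARG GITLAB_USER
ARG GITLAB_TOKEN
RUN git clone "https://${GITLAB_USER}:${GITLAB_TOKEN}@gitlab.com/yourgroup/yourproject.git" /var/www/app
```

Built with `docker build --build-arg GITLAB_USER=... --build-arg GITLAB_TOKEN=... .`. Note that build args can remain visible in the image history, so for production it is safer to clone in the first stage of a multi-stage build, or to let CI inject a short-lived token.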
Stating that need means, IMHO, that you probably do not want or need to use Docker at all.
It might prove simpler to craft a script that just clones and runs the software, or restarts whichever daemon, whenever a new version is available.
Docker is not meant to store data inside containers or to run code pulled from outside the image. If you want to do either of those things, you should probably reconsider your perception of the technology and change the way you craft the containers, or switch to actual VMs.
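The script-based alternative suggested above might look something like this minimal sketch (the repo URL, checkout directory, and entry point are assumptions):

```shell
#!/bin/sh
# Clone on first run, pull on subsequent runs, then (re)start the program.
set -e

REPO="git@gitlab.com:yourgroup/yourproject.git"   # placeholder URL
DIR="$HOME/yourproject"

if [ -d "$DIR/.git" ]; then
    # Fast-forward only, so a diverged local copy fails loudly instead of merging
    git -C "$DIR" pull --ff-only
else
    git clone "$REPO" "$DIR"
fi

cd "$DIR"
exec python3 main.py          # placeholder entry point
```

Run from cron or a systemd timer, this gets "redeploy on new version" without any container machinery.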
1: The Docker way: create an image "panda" with pandas only inside, in your Docker repository, and build your other containers starting with "FROM panda".
2: Store pandas locally before you build, possibly wrapping the build with some custom script that checks for a new pandas version.
3: Use a caching proxy.
#1 is probably the most Docker-ish way. You can build a different base image for each pandas version and tag them panda:version.
If you use FROM panda, Docker will pick panda:latest and resolve it adequately.
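The base-image approach in #1 might be sketched like this (the image name, tags, and pinned version are illustrative):

```dockerfile
# Dockerfile.panda -- built once per pandas release, e.g.:
#   docker build -t panda:0.24.2 -t panda:latest -f Dockerfile.panda .
FROM python:3.6-alpine
RUN apk add --no-cache build-base linux-headers \
    && pip install --no-cache-dir pandas==0.24.2

# A downstream project Dockerfile then starts with just:
#   FROM panda
# which resolves to panda:latest, so the slow pandas build is cached
# and only the project layers rebuild on each change.
```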