Role: datascience
Description
Installs and configures a complete data science environment, including Anaconda/Conda, Jupyter Notebook, and the R statistical computing language. The role is optional and kept separate from the general development tools, so deployments stay fast when data science capabilities are not needed.
Requirements
- Ansible 2.9+
- Debian/Ubuntu systems
- At least 5GB free disk space (the Anaconda download is ~700MB; the full stack uses ~3GB once installed)
- Root or sudo access
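The disk-space requirement can be checked before the role runs. The play below is a minimal sketch and is not part of the role: it assumes facts are gathered, that the conda installation path lives on the root filesystem (/), and that the target group is named datascience_servers.

```yaml
- hosts: datascience_servers
  gather_facts: true   # ansible_mounts is only populated when facts are gathered
  pre_tasks:
    - name: Fail early if the root filesystem has less than 5 GB free
      ansible.builtin.assert:
        that:
          - >-
            (ansible_mounts
            | selectattr('mount', 'equalto', '/')
            | map(attribute='size_available')
            | first) > 5 * 1024 * 1024 * 1024
        fail_msg: "Less than 5 GB free on /; the datascience role needs more space."
  roles:
    - role: datascience
      install_conda: true
```

If conda_install_path points to a different filesystem, the mount point in the filter would need to change accordingly.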
Installed Components
Anaconda/Conda
- Full Anaconda3 distribution (latest version)
- Python data science packages (pandas, numpy, matplotlib, scikit-learn)
- Conda package manager
- Initialized for bash (zsh initialization via shell role)
Jupyter Notebook
- Jupyter Notebook server
- IPython kernel
- JupyterLab interface
- Systemd service for automatic startup
- Web-based access on configurable port
R Language
- R base and development packages
- R recommended packages
- CRAN repository configuration
- IRkernel for Jupyter integration
- Common R packages
Variables
| Variable | Default | Description |
|---|---|---|
| install_conda | false | Install Anaconda/Conda |
| conda_install_path | {{ ansible_env.HOME }}/anaconda3 | Conda installation directory |
| install_jupyter | false | Install Jupyter Notebook (requires conda) |
| jupyter_port | 8888 | Jupyter web server port |
| jupyter_bind_all_interfaces | true | Listen on all network interfaces |
| jupyter_allow_remote | true | Allow remote connections |
| install_r | false | Install R language |
| r_packages | [r-base, r-base-dev, r-recommended] | R packages to install |
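These defaults can be overridden anywhere Ansible accepts variables. As an illustrative sketch, a group variables file might look like the following; the file name and the non-default values are hypothetical and not taken from the role.

```yaml
# group_vars/datascience_servers.yml (hypothetical file name)
install_conda: true
install_jupyter: true                # requires install_conda
install_r: false

conda_install_path: /opt/anaconda3   # example value; the default is under the user's home
jupyter_port: 9999                   # example value; the default is 8888
```

Keeping the overrides in group variables, rather than passing them with -e on every run, makes repeated deployments to the same hosts consistent.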
Dependencies
- base role (for core utilities)
Example Playbook
Full Data Science Stack
- hosts: datascience_servers
  roles:
    - role: datascience
      install_conda: true
      install_jupyter: true
      install_r: true
Conda + Jupyter Only
- hosts: jupyter_servers
  roles:
    - role: datascience
      install_conda: true
      install_jupyter: true
      install_r: false
R Only (no Python)
- hosts: r_servers
  roles:
    - role: datascience
      install_conda: false
      install_r: true
Usage
Installation
# Install full data science stack
make datascience HOST=server01
# Install on specific host with custom vars
ansible-playbook playbooks/development.yml --limit server01 --tags datascience \
  -e "install_conda=true install_jupyter=true install_r=true"
Post-Installation
Set Jupyter Password
jupyter notebook password
Access Jupyter
# Local
http://localhost:8888
# Remote
http://your-server-ip:8888
Verify Installations
conda --version
jupyter --version
R --version
Tags
- datascience: All data science tasks
- conda: Conda/Anaconda installation only
- jupyter: Jupyter Notebook installation only
- r, rstats: R language installation only
Services
Jupyter Notebook Service
- Service name: jupyter-notebook.service
- Start: systemctl start jupyter-notebook
- Status: systemctl status jupyter-notebook
- Logs: journalctl -u jupyter-notebook
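The same checks can be run across several hosts from Ansible. The play below is a minimal sketch, assuming the service name listed above and the default jupyter_port of 8888; it is illustrative and not shipped with the role.

```yaml
- hosts: jupyter_servers
  become: true
  tasks:
    - name: Ensure the Jupyter service is running and enabled at boot
      ansible.builtin.systemd:
        name: jupyter-notebook
        state: started
        enabled: true

    - name: Wait until Jupyter accepts connections on its port
      ansible.builtin.wait_for:
        port: "{{ jupyter_port | default(8888) }}"
        timeout: 30
```

Note that the first task starts the service if it is stopped, so this play enforces the running state rather than only reporting it.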
Performance Notes
Installation Times
- Anaconda: 5-10 minutes (700MB download)
- Jupyter: 2-5 minutes (if Anaconda already installed)
- R: 10-30 minutes (compilation of packages)
Disk Space
- Anaconda: ~2.5GB after installation
- R + packages: ~500MB
- Total: ~3GB for full stack
Security Considerations
- Jupyter Password: Always set a password after installation
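- Firewall: Restrict access to port 8888 (or your custom jupyter_port) with a firewall; by default Jupyter listens on all network interfaces and allows remote connections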
- HTTPS: Consider using HTTPS for remote access
- User Isolation: Jupyter runs as the ansible user by default
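Because jupyter_bind_all_interfaces and jupyter_allow_remote both default to true, one way to reduce exposure is to turn them off and reach Jupyter through another channel, such as an SSH tunnel. The sketch below assumes that setting both variables to false makes the role bind Jupyter to the loopback interface only; this README does not spell out the exact behavior, so verify it against the role's templates.

```yaml
- hosts: jupyter_servers
  roles:
    - role: datascience
      install_conda: true
      install_jupyter: true
      jupyter_bind_all_interfaces: false   # assumed to restrict listening to localhost
      jupyter_allow_remote: false          # assumed to reject non-local connections
```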
Troubleshooting
Conda Not in PATH
# Add to .bashrc or .zshrc
export PATH="$HOME/anaconda3/bin:$PATH"
Jupyter Service Won't Start
# Check logs
journalctl -u jupyter-notebook -n 50
# Verify Jupyter is present in the conda installation
# (adjust the path if conda_install_path was customized)
"$HOME/anaconda3/bin/jupyter" --version
R Package Installation Fails
# Install manually in R, then register the kernel with Jupyter
R
> install.packages("IRkernel")
> IRkernel::installspec()
Integration with Other Roles
- Shell Role: Provides zsh with conda integration
- Monitoring: btop/htop for resource monitoring
- Docker: Jupyter can alternatively be run in containers
Notes
- Anaconda installer is cleaned up after installation
- Conda init for zsh is handled by the shell role
- IRkernel is automatically installed if both Jupyter and R are enabled
- R packages are compiled during installation (can be slow)
- Jupyter service starts on boot automatically