GitHub - pop-os/tensorman: Utility for easy management of Tensorflow containers
Excerpt
Utility for easy management of Tensorflow containers
Summary
Main Summary
Tensorman is a specialized utility hosted on GitHub that streamlines the management of Tensorflow containers and is designed specifically to optimize machine learning workflows. The tool simplifies the complex operations involved in deploying and administering containerized Tensorflow environments, a significant advantage for developers and data scientists who need reproducible, configurable environments. By automating repetitive tasks and standardizing container processes, Tensorman improves productivity and reduces operational friction in machine learning projects. Its integration with modern container technologies enables more agile management of computing resources, making it easier to switch between Tensorflow versions and environment configurations without extensive manual intervention. The tool positions itself as a key component for users seeking agility, consistency, and scalability in Tensorflow-based workflows, especially in iterative development environments and multi-platform deployments.
Key Elements
- Automated management of Tensorflow containers: Tensorman automates the creation, configuration, and execution of Tensorflow containers, eliminating manual steps and reducing configuration errors in machine learning development environments.
- Compatibility with multiple Tensorflow versions: The tool makes it easy to switch between Tensorflow versions, which helps with comparative testing, migrations, and projects that have specific versioning requirements.
- Simplified interface for advanced users and beginners: Designed around intuitive commands, Tensorman offers a gentle learning curve without sacrificing advanced functionality, which suits multidisciplinary teams that work with containers.
- Integration with modern container systems: The utility is built to work with technologies such as Docker, ensuring compatibility with existing infrastructure and easing adoption in production and development environments.
Analysis and Implications
Tools like Tensorman reinforce the trend toward the democratization of machine learning by reducing the technical complexity of managing specialized environments. This directly affects development speed and the reproducibility of experiments, both critical factors in research and enterprise applications. Its container-based approach also promotes modern development practices and portability, which are essential in hybrid or multi-cloud deployments.
Additional Context
Developed within the ecosystem of Pop!_OS, a Linux distribution aimed at creators and developers, Tensorman reflects a commitment to open-source tools that boost technical productivity. Its availability on GitHub allows community contributions, which supports ongoing updates and adaptation to new needs in the artificial intelligence ecosystem.
Content
Tensorflow Container Manager
Packaging Tensorflow for Linux distributions is notoriously difficult, if not impossible. Every release of Tensorflow is accompanied by a myriad of possible build configurations, which requires building many variants of Tensorflow for each Tensorflow release. To make matters worse, each new version of Tensorflow depends on a large number of shared dependencies, which may not be available on older versions of a Linux distribution that are still actively supported by the distribution maintainers.
To solve this problem, the Tensorflow project provides official Docker container builds, which allow Tensorflow to operate in an environment that is isolated from the rest of the system. This virtual environment can operate independently of the base system, allowing you to use any version of Tensorflow on any version of a Linux distribution that supports the Docker runtime.
However, configuring and managing Docker containers for Tensorflow using the docker command line is currently tedious, and managing multiple versions for different projects is even more so. To solve this problem for our users, we have developed tensorman as a convenient tool to manage the installation and execution of Tensorflow Docker containers. It condenses the command-line soup into a set of simple commands that are easy to memorize.
Comparison to Docker Command
Take the following Docker invocation as an example:
docker run -u $UID:$UID -v $PWD:/project -w /project \
--runtime=nvidia -it --rm tensorflow/tensorflow:latest-gpu \
python ./script.py
This selects the latest version of Tensorflow with GPU support, mounts the working directory at /project, launches the container as the current user, and executes script.py with the Python binary in the container. With tensorman, we can achieve the same with:
tensorman run --gpu python -- ./script.py
This defaults to the latest version. The version and tag variants can be set as defaults per-run, per-project, or user-wide.
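To make the comparison concrete, the following sketch builds the argument list of the manual Docker invocation from the pieces that tensorman fills in on your behalf. It is illustrative Python, not tensorman's actual implementation: the function name `docker_argv` and its parameters are hypothetical, and only the `-gpu` tag suffix seen in the example above is modeled.

```python
import os
import shlex


def docker_argv(command, args, tag="latest", gpu=False, uid=None, cwd=None):
    """Build a docker argv equivalent to the manual invocation in this README.

    Hypothetical helper for illustration; not part of tensorman.
    """
    uid = os.getuid() if uid is None else uid
    cwd = os.getcwd() if cwd is None else cwd
    # The example above uses the "latest-gpu" tag for the GPU build.
    image_tag = f"{tag}-gpu" if gpu else tag
    argv = [
        "docker", "run",
        "-u", f"{uid}:{uid}",     # run as the current user
        "-v", f"{cwd}:/project",  # mount the working directory
        "-w", "/project",         # and start inside it
    ]
    if gpu:
        argv.append("--runtime=nvidia")
    argv += ["-it", "--rm", f"tensorflow/tensorflow:{image_tag}", command, *args]
    return argv


# Rough equivalent of: tensorman run --gpu python -- ./script.py
print(shlex.join(docker_argv("python", ["./script.py"], gpu=True)))
```

Returning a list rather than a string keeps each argument separate, so the result could be handed to `subprocess.run` without shell quoting concerns.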
Installing/Updating Containers
By default, docker will automatically install a container image the first time you run a container whose image is not already installed. However, if you would like to install an image beforehand, you may do so using the pull subcommand.
tensorman pull 1.14.0
tensorman pull latest
Running commands in containers
The run subcommand allows you to execute a command from within the container. This could be the bash shell, for an interactive session inside the container, or the program / compiler which you wish to run.
# Default container version with Bash prompt
tensorman run bash
# Default container version with Python script
tensorman run python -- script.py
# Default container version with GPU support
tensorman run --gpu bash
# With GPU, Python3, and Jupyter support
tensorman run --gpu --python3 --jupyter bash
Setting the container version
Taking inspiration from rustup, there are methods to set the container version per-run, per-project, and per-user. The per-run version always takes priority over a per-project definition, which takes priority over the per-user configuration.
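The priority order described above can be sketched as a simple first-match lookup. This is a minimal illustration under the stated rule only (per-run, then per-project, then per-user, then `latest`); the function `resolve_image` is a hypothetical name and does not reflect how tensorman is written internally.

```python
def resolve_image(per_run=None, per_project=None, per_user=None, fallback="latest"):
    """Return the first setting that is present, in priority order."""
    for candidate in (per_run, per_project, per_user):
        if candidate is not None:
            return candidate
    return fallback


# All three levels set: the per-run value wins.
print(resolve_image(per_run="1.14.0", per_project="2.0.0", per_user="nightly"))
# No per-run value: the per-project value wins.
print(resolve_image(per_project="2.0.0", per_user="nightly"))
# Nothing configured: fall back to the default tag.
print(resolve_image())
```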
Setting per-run
If a version is specified following a + argument, tensorman will prefer this version.
tensorman +1.14.0 run --python3 --gpu bash
Custom images may be specified with a = argument.
tensorman =custom-image run --gpu bash
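As an illustration of the two prefixes, the sketch below separates a leading `+VERSION` or `=IMAGE` argument from the rest of the command line. It assumes the selector is always the first argument, as in the examples above; `split_selector` is a hypothetical helper, and tensorman's real argument parsing may differ.

```python
def split_selector(argv):
    """Split a leading +VERSION or =IMAGE selector from the remaining arguments.

    Returns (kind, value, rest), where kind is "tag", "image", or None.
    """
    if argv and len(argv[0]) > 1:
        first = argv[0]
        if first.startswith("+"):
            return "tag", first[1:], list(argv[1:])
        if first.startswith("="):
            return "image", first[1:], list(argv[1:])
    return None, None, list(argv)


print(split_selector(["+1.14.0", "run", "--python3", "--gpu", "bash"]))
print(split_selector(["=custom-image", "run", "--gpu", "bash"]))
print(split_selector(["run", "bash"]))
```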
Setting per-project
There are two files that can be used for configuring Tensorman locally: tensorflow-toolchain and Tensorman.toml. These files are detected automatically if they are found in the current directory or any parent directory.
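Parent-directory detection of this kind is commonly implemented by walking upward from the working directory until the file is found. The sketch below shows that general technique; it is an assumption about the approach, not a description of tensorman's code, and `find_in_parents` is a hypothetical name.

```python
from pathlib import Path


def find_in_parents(start, filename):
    """Return the nearest `filename` in `start` or any of its ancestors, else None."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


for name in ("tensorflow-toolchain", "Tensorman.toml"):
    print(name, "->", find_in_parents(Path.cwd(), name))
```

Because the search starts in the current directory and moves outward, a file placed at the project root applies to every subdirectory beneath it.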
tensorflow-toolchain
This file overrides the tensorflow image defined in either Tensorman.toml or the user-wide configuration file.
Or specifying a custom image:
Tensorman.toml
This file supports additional configuration parameters. The user-wide configuration is located at ~/.config/tensorman/config.toml, and the project-wide configuration at Tensorman.toml. One reason you may want to use this file is to declare additional Docker flags with the docker_flags key.
Using a default tensorflow image:
docker_flags = [ '-p', '8080:8080' ]
tag = '2.0.0'
variants = ['gpu', 'python3']
Defining a custom image:
docker_flags = [ '-p', '8080:8080' ]
image = 'custom-image'
variants = ['gpu']
One useful docker flag is -v, which can be used at runtime to mount directories that are not included in your image. The syntax for the argument of -v is source:destination. For example, if you have a large dataset in your home directory that you don't want to include as part of your image, you can mount it at runtime by adding the following line to your config.toml file:
docker_flags = [ '-v', '/home/<username>/<dataset>:/home/<username>/<dataset>' ]
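If you mount several datasets, it can help to generate the line rather than type it by hand. The following sketch builds the `-v` pair in the source:destination form described above and renders it as a docker_flags entry. Both helper names and the `~/datasets/imagenet` path are placeholders chosen for illustration, and the rendering assumes the paths contain no single quotes.

```python
from pathlib import Path


def volume_flag(source, destination=None):
    """Return the ['-v', 'source:destination'] pair for a bind mount.

    When no destination is given, the data appears at the same path
    inside the container as on the host.
    """
    source = Path(source).expanduser()
    destination = Path(destination) if destination else source
    return ["-v", f"{source}:{destination}"]


def docker_flags_line(flags):
    """Render a list of flags as a docker_flags entry for config.toml."""
    quoted = ", ".join(f"'{flag}'" for flag in flags)
    return f"docker_flags = [ {quoted} ]"


# "~/datasets/imagenet" is a placeholder path, not one from the README.
print(docker_flags_line(volume_flag("~/datasets/imagenet")))
```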
Setting per-user
You can set a default version user-wide using the default subcommand. This version of Tensorflow will be launched whenever you use the tensorman run command.
tensorman default 1.14.0
tensorman default latest gpu python3
tensorman default nightly
By default, tensorman will use latest as the default per-user version tag.
Showing the active container version
If you would like to know which container will be used when launched from the current working directory, you can use the show command.
tensorman show
Removing container images
Having many containers installed simultaneously on the same system can quickly use a lot of disk storage. If you find yourself in need of culling the containers installed on your system, you may do so with the remove command.
tensorman remove 1.14.0
tensorman remove latest
tensorman remove 481cb7ea88260404
tensorman remove custom-image
Listing installed container images
To aid in discovering what containers are installed on the system, the list subcommand is available.
tensorman list
Creating a custom image
In most projects, you will need to pull in more dependencies than the base Tensorflow image has. To do this, you will need to create the image by running a tensorflow container as root, installing and setting up the environment how you need it, and then saving those changes as a new custom image.
To do so, you will need to build the container in one terminal, and save it from another.
Build new image
First, launch a terminal where you will begin configuring the docker image:
tensorman run --gpu --python3 --root --name CONTAINER_NAME bash
Once you've made the changes needed, open another terminal and save it as a new image:
tensorman save CONTAINER_NAME IMAGE_NAME
Running the custom image
You should then be able to specify that container with tensorman, like so:
tensorman =IMAGE_NAME run --gpu bash
The --python3 and --jupyter flags do nothing for custom containers, but --gpu is required to enable runtime support for the GPU.
Removing the custom image
Images saved through tensorman are manageable through tensorman. Listing and removing work the same way as for the official images:
tensorman remove IMAGE_NAME
License
Licensed under the GNU General Public License, Version 3.0, (LICENSE or https://www.gnu.org/licenses/gpl-3.0.en.html)
Contribution
Any contribution intentionally submitted for inclusion in the work by you shall be licensed under the GNU GPLv3.
Source: GitHub