Profiling with Sciagraph: the basics
In order to use Sciagraph to profile your code, you need to ensure:
- The sciagraph package is installed.
- You have an access token stored on disk or set via environment variables.
- Your program is run with Sciagraph enabled.
There are also custom integrations for different frameworks, including Jupyter, Celery and MLFlow, with more coming soon.
One-time step: Sign up for an account
To use Sciagraph, you need an account with the Sciagraph service.
Sign up for a free or paid Sciagraph account to get the access key you will need in Step 2.
Step 1: Making sure sciagraph is installed
The sciagraph package can be installed from PyPI like any other package.
Make sure you’re using a recent version of pip by upgrading it first; upgrading pip is easiest inside a virtualenv.
pip install --upgrade pip
pip install sciagraph
Since it’s a normal PyPI package, you can add sciagraph as a dependency of your application by adding it to the relevant dependency list:
- requirements.txt
- setup.py
- Pipfile (if you’re using Pipenv)
- pyproject.toml (if you’re using Poetry or Flit)
- environment.yml (if you’re using Conda)
Conda packages are not available yet.
Step 2: Making sure an access token is available
In order to validate that you are a licensed user of Sciagraph (on a free or paid plan), you need to set up the access token. You can use either a configuration file or environment variables.
Option #1: Storing the access token in a file
If you visit your account page, it will include a command to run that will store the access token in a config file on disk. It will look something like this:
$ python -m sciagraph.store_token ...
This is the recommended option when profiling during development, because you only have to do it once.
Option #2: Setting the access token using environment variables
If you don’t use a config file, you need to set two environment variables wherever your program is running: SCIAGRAPH_ACCESS_KEY and SCIAGRAPH_ACCESS_SECRET.
You can get the values for these two variables from your account page.
In shell scripts you can set these with an export command:
export SCIAGRAPH_ACCESS_KEY=...
export SCIAGRAPH_ACCESS_SECRET=...
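If you rely on environment variables rather than a config file, it can help to check that both are present before starting a long job. The following is a minimal sketch, not part of Sciagraph itself: the helper name missing_sciagraph_vars is hypothetical, and only the two variable names come from the text above.

```python
import os

# Variable names are taken from the documentation above.
REQUIRED_VARS = ("SCIAGRAPH_ACCESS_KEY", "SCIAGRAPH_ACCESS_SECRET")


def missing_sciagraph_vars(environ=None):
    """Hypothetical helper: list required variables that are unset or empty.

    Pass a mapping to check something other than the current process
    environment (useful for testing).
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_sciagraph_vars() with no arguments inspects the current process environment; an empty list means both variables are set. This check is only relevant if you have not stored the token in a config file.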
Setting environment variables in containers
Container runtimes typically have a way to set environment variables. For example:
- Docker Compose files let you set environment variables in a variety of ways, for example from .env files.
- Kubernetes lets you set environment variables from secrets.
Please reach out if you need help.
Step 3: Run your program with Sciagraph enabled
By default Sciagraph profiles the whole process, from start to finish, which we’ll cover here. Alternatively, you can also run multiple jobs in a single process.
Let’s say your program is typically run like this:
$ python yourprogram.py --load=data/ --twiddle=2.718
There are two ways you can run your program with Sciagraph.
Option #1: Running your program with python -m sciagraph
Instead of running your program as above, you can run it with python -m sciagraph run:
$ python -m sciagraph run yourprogram.py --load=data/ --twiddle=2.718
This launches a new Python subprocess, and that is what actually runs your code.
Any arguments after run are passed to the new Python interpreter.
So if your program is typically run like this:
$ python -m yourpackage arg1 arg2
You can run it with Sciagraph like so:
$ python -m sciagraph run -m yourpackage arg1 arg2
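If you launch jobs from a Python script rather than a shell, the same commands can be assembled as argument lists. This is a sketch under the assumption that you start the profiled program with the standard library's subprocess module; the helper name sciagraph_run_command is hypothetical, and the program names are the placeholders used above.

```python
import sys


def sciagraph_run_command(python_args, python=sys.executable):
    """Hypothetical helper: build the argv for `python -m sciagraph run ...`.

    Everything in python_args ends up after `run`, so it is passed on to
    the new Python interpreter, as described above.
    """
    return [python, "-m", "sciagraph", "run", *python_args]


# Script form and module form, mirroring the two shell examples above.
script_cmd = sciagraph_run_command(
    ["yourprogram.py", "--load=data/", "--twiddle=2.718"]
)
module_cmd = sciagraph_run_command(["-m", "yourpackage", "arg1", "arg2"])

# To actually launch a run (requires sciagraph installed and a token set up):
# import subprocess
# subprocess.run(script_cmd, check=True)
```

Building the command as a list avoids shell quoting problems when arguments contain spaces.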
Option #2: Automatically profile all Python commands
In some cases you can’t use python -m sciagraph, or you may want to automatically profile all Python programs you run.
You can do so by setting an environment variable:
$ export SCIAGRAPH_MODE=process
$ python yourprogram.py --load=data/ --twiddle=2.718
The Python program above will be automatically profiled using Sciagraph, because that environment variable is set.
Optional: Configure where reports are written
After you’ve profiled a program, a profiling report will be written out to disk.
The default location for reports is a new directory of the form sciagraph-result/<timestamp> in the process’ current working directory.
The location where the report was written is printed in a message at the end of the run.
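If you want to find the most recent report programmatically, for example to archive it after a run, one possible approach is sketched below. It assumes only the default sciagraph-result/<timestamp> layout described above; the helper name latest_report_dir is hypothetical, and it sorts by modification time because the exact timestamp format of the directory names is not specified here.

```python
from pathlib import Path


def latest_report_dir(working_dir="."):
    """Hypothetical helper: newest directory under sciagraph-result/, or None."""
    root = Path(working_dir) / "sciagraph-result"
    if not root.is_dir():
        return None
    candidates = [p for p in root.iterdir() if p.is_dir()]
    # Compare modification times instead of names, since the timestamp
    # format used in the directory names is an unknown here.
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

The message printed at the end of the run remains the authoritative source for the report location; this helper is only a convenience for scripts.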
Depending on how you run Sciagraph, you can also configure a custom storage location.
Customizing, option #1: CLI
If you’re using the CLI, you can override the destination with the --output-path option:
$ python -m sciagraph --output-path ./profiling-report run yourprogram.py
Customizing, option #2: Environment variables
If you’re running Sciagraph with SCIAGRAPH_MODE=process, you can set the SCIAGRAPH_OUTPUT_PATH environment variable to customize where reports are stored.
$ export SCIAGRAPH_MODE=process
$ export SCIAGRAPH_OUTPUT_PATH=./profiling-report
$ python yourprogram.py
Your report will now be stored in the ./profiling-report directory.
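When the profiled program is started from another Python process (a test runner or job launcher, say), the same two variables can be passed through the child's environment instead of being exported in a shell. A minimal sketch, assuming the standard library's subprocess module is used to start the program; the helper name sciagraph_process_env is hypothetical.

```python
import os


def sciagraph_process_env(output_path=None, base_env=None):
    """Hypothetical helper: copy an environment and enable process mode.

    SCIAGRAPH_MODE and SCIAGRAPH_OUTPUT_PATH are the variables described
    above; the base environment is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["SCIAGRAPH_MODE"] = "process"
    if output_path is not None:
        env["SCIAGRAPH_OUTPUT_PATH"] = str(output_path)
    return env


# To launch a profiled run (requires sciagraph installed and a token set up):
# import subprocess, sys
# subprocess.run(
#     [sys.executable, "yourprogram.py"],
#     env=sciagraph_process_env("./profiling-report"),
#     check=True,
# )
```

Copying the environment keeps the setting scoped to the one child process, so other Python programs started from the same parent are not profiled.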
Optional: Automatically opening profiling reports in a browser
In process mode, in GUI environments, Sciagraph will automatically open reports in a browser.
You can override this with the --open-browser option, e.g.:
$ python -m sciagraph --open-browser=no run yourprogram.py
Possible values are no (never open in a browser), yes (always open), and auto (the default).
You can also control this with the SCIAGRAPH_OPEN_BROWSER environment variable, which takes the same options.