Project Manifest

A Project Manifest is a file that is managed as part of your source code. It defines the entire application stack of your Project so it can be automatically built, deployed, and operated on the Pergola platform.

The Manifest file is placed within the root folder of your source tree and can be either a YAML or a JSON file. For YAML, allowed filenames are pergola.yaml or pergola.yml. For JSON, it is pergola.json.
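As a rough illustration of the naming rule above, the following sketch looks for a Manifest in the root of a source tree. The function name `find_manifest` and the lookup order are assumptions made for this example; it is not Pergola's own discovery logic.

```python
from pathlib import Path
from typing import Optional

# Filenames listed in this documentation; the order they are tried in is an assumption.
MANIFEST_NAMES = ("pergola.yaml", "pergola.yml", "pergola.json")


def find_manifest(root: Path) -> Optional[Path]:
    """Return the first Manifest file found directly in `root`, or None."""
    for name in MANIFEST_NAMES:
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_manifest(Path("."))` from the root of a checkout would return the path of the Manifest, or `None` if no file with one of the allowed names exists there.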

tip

You can validate your Manifest file locally, before even committing to Git.

Example Manifest file

version: v1

components:
- name: web-api
  docker:
    file: Dockerfile
    build-context: . # default is .
    build-args:
    - name: ARG1
      value: MY_VALUE1
    - name: ARG2
      value: MY_VALUE2
  env:
  # static env variable
  - name: API_VERSION
    value: v1
  # variable from required config
  - name: PUBLIC_FQDN_ROOT
    config-ref: public_hostname
  # variable from optional config with default
  - name: LOG_LEVEL
    config-ref: loglevel
    value: info # default
  - name: DB_HOST
    component-ref: db
  - name: DB_PORT
    value: "5432"
  - name: DB_USER
    config-ref: db.user
  - name: DB_PWD
    config-ref: db.pwd
  files:
  # map file from required config
  - path: /etc/web-api/config.json
    config-ref: web-api-config.json
  # map file providing inline (static) content
  - path: /var/lib/web-api/static-content.xml
    content: |
      <?xml version="1.0" encoding="UTF-8"?>
      <note>
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
      </note>
  scaling:
    min: 2
    max: 5
  ports:
  - 8080
  - 9091
  ingresses:
  - host: api
    path: /v1
    port: 8080

- name: db
  docker:
    image: postgres:16
  ports:
  - 5432
  env:
  - name: POSTGRES_PASSWORD
    config-ref: postgres_initial_pwd
  resources:
    cpu: 500m
    memory: 2Gi
  storage:
  - name: pgdata
    path: /var/lib/postgresql/data
    size: 50Gi

- name: job-regularly
  docker:
    file: job1/Dockerfile
  command:
  - "./run.sh"
  args:
  - "--db=$DB_HOST:$DB_PORT"
  - "crunch"
  env:
  - name: DB_HOST
    component-ref: db
  - name: DB_PORT
    value: "5432"
  scheduled: "0 2 * * *"
  identity: "data-processor"

- name: job-once
  docker:
    file: job2/Dockerfile
  scheduled: "@release"

Version

Every Manifest file needs a version. Currently, only v1 is supported. Future releases of Pergola will support different flavours and capabilities of the Manifest file, distinguished by unique version numbers.

version: v1

Components

A Project needs at least one Component. A Component is the smallest unit which is built and runs on the Pergola platform. A typical Project usually consists of multiple Components, e.g. a dashboard, a database, scheduled job(s), etc., which all together form your application stack.

Per Component, you describe how to build and run it. The minimum required fields are:

  • name of your Component
  • docker that points to a Dockerfile or to an existing container image

components:
- name: dashboard
  docker:
    file: Dockerfile

The name must be unique within the Manifest. Allowed characters are lowercase letters (a-z), digits (0-9), and hyphens (-). The name must start with a letter and cannot end with a hyphen.
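The naming rule can be checked locally with a regular expression. This is a minimal sketch; `is_valid_name` is a hypothetical helper, and any length limit the platform may enforce is not documented here and therefore not checked.

```python
import re

# Lowercase letters, digits and hyphens; must start with a letter and
# must not end with a hyphen. Length limits, if any, are not covered.
NAME_PATTERN = re.compile(r"[a-z](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_name(name: str) -> bool:
    """Return True if `name` follows the documented naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_name("web-api")` is true, while `"Web-Api"`, `"1db"`, and `"db-"` are rejected.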

Renaming Components

As each Component is identified by its unique name, renaming it consequently results in a "new" Component. Technically, any previously deployed instances of that Component with the old name will be shut down and new ones created under the new name.

This also has an impact on persistent data. Any Storage attached to a Component is exclusively owned by that Component, thus a "new" Component always starts with fresh data. However, any storage which was attached to the "old" Component will be retained and automatically reclaimed as soon as that Component is added back. This ensures that persistent data is not lost if you change the code branch or use different versions with a changed set of Components.

Docker

Within the docker field, you either define the path to the Dockerfile to be built:

docker:
  file: Dockerfile

Here, file can also be a relative path to a specific Dockerfile within your source tree:

docker:
  file: path/to/some_other_docker.file

Or you reference an existing container image from a publicly available repository:

docker:
  image: python:3.10.0-alpine

Or, fully qualified, an image from a specific container registry:

docker:
  image: public.ecr.aws/pergola/postgres:16

Build context

The build-context is a relative path within your source tree that refers to the files and directories that will be available during build. Anything not included in the build context won’t be accessible to commands in your Dockerfile.

docker:
  file: Dockerfile
  build-context: path/to/context

In the example above, the Dockerfile is located at the root of your source tree and will see only the files under path/to/context.

For example, a COPY file.txt /some/path/ instruction within the Dockerfile above will copy the path/to/context/file.txt from your source tree to /some/path/file.txt within the final container image.

Build arguments

build-args is a list of key-value pairs that are used as build-time variables:

docker:
  file: Dockerfile
  build-args:
  - name: CLUSTER_ENABLED
    value: "true"
  - name: BUILD_FLAGS
    value: "-z -o --raft"

Any variable defined here should also be declared (and used) in the referenced Dockerfile via the ARG instruction. These variables only exist during the build process and do not persist in the final image.

info

Only strings can be passed as values. Make sure that values such as booleans or numbers are quoted as strings.

For example, this is wrong:

docker:
  build-args:
  - name: A_NUMBER
    value: 1234
  - name: A_BOOL
    value: true

This is correct:

docker:
  build-args:
  - name: A_NUMBER
    value: "1234"
  - name: A_BOOL
    value: "true"
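The difference between the two variants is easy to see once the Manifest is parsed: unquoted values arrive as numbers or booleans rather than strings. The sketch below uses the JSON form of the two examples above and a hypothetical helper, `non_string_build_args`, to list the offending entries. It only illustrates the rule and is not the platform's validator.

```python
import json


def non_string_build_args(component: dict) -> list:
    """Return the names of build-args whose value is not a string."""
    build_args = component.get("docker", {}).get("build-args", [])
    return [arg["name"] for arg in build_args if not isinstance(arg.get("value"), str)]


# JSON equivalents of the "wrong" and "correct" examples above.
wrong = json.loads(
    '{"docker": {"build-args": ['
    '{"name": "A_NUMBER", "value": 1234}, {"name": "A_BOOL", "value": true}]}}'
)
correct = json.loads(
    '{"docker": {"build-args": ['
    '{"name": "A_NUMBER", "value": "1234"}, {"name": "A_BOOL", "value": "true"}]}}'
)

print(non_string_build_args(wrong))    # names of the unquoted values
print(non_string_build_args(correct))  # empty list
```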

Command and runtime arguments

If you want to override your Dockerfile's start command (ENTRYPOINT) and/or arguments (CMD), you can use the command and args fields in the Manifest:

command:
- "./run.sh"
args:
- "--db=$DB_HOST:$DB_PORT"
- "crunch"

  • command field specifies the actual command run by your Component
  • args field specifies the arguments passed to the command

When you specify command and/or args, these rules apply:

  • only strings are allowed; if you have a single number or a boolean (e.g. as args, or in very strange cases even as a command), make sure they are quoted as strings, e.g. "1234" or "true" respectively
  • if you supply a command, with or without args, any ENTRYPOINT or CMD defined in your Dockerfile is ignored; only the supplied command and, if provided, the args will start your Component
  • if you supply args only, any CMD defined in your Dockerfile is ignored; your Component will start with the ENTRYPOINT (if defined in your Dockerfile) and the args defined in your Manifest
  • if you supply neither command nor args, the defaults (ENTRYPOINT and/or CMD) defined in your Dockerfile are used to start your Component

Examples:

Manifest | Dockerfile | Actual command run
command: ["/cmdM"]; args: ["foo", "bar"] | ENTRYPOINT ["/cmdD"]; CMD ["zoo", "boo"] | /cmdM foo bar
command: ["/cmdM"] | ENTRYPOINT ["/cmdD"]; CMD ["zoo", "boo"] | /cmdM
args: ["foo", "bar"] | ENTRYPOINT ["/cmdD"]; CMD ["zoo", "boo"] | /cmdD foo bar
<nothing specified> | ENTRYPOINT ["/cmdD"]; CMD ["zoo", "boo"] | /cmdD zoo boo
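The precedence rules in this section can be written down as a small function. This is only a sketch of the documented behaviour; `effective_command` and its parameters are names chosen for this example.

```python
from typing import List, Optional


def effective_command(
    command: Optional[List[str]],
    args: Optional[List[str]],
    entrypoint: Optional[List[str]],
    cmd: Optional[List[str]],
) -> List[str]:
    """Combine Manifest command/args with Dockerfile ENTRYPOINT/CMD."""
    if command is not None:
        # A Manifest command replaces both ENTRYPOINT and CMD.
        return list(command) + list(args or [])
    if args is not None:
        # Manifest args replace CMD only; ENTRYPOINT is kept.
        return list(entrypoint or []) + list(args)
    # Nothing specified: fall back to the Dockerfile defaults.
    return list(entrypoint or []) + list(cmd or [])
```

With `entrypoint=["/cmdD"]` and `cmd=["zoo", "boo"]`, passing only `args=["foo", "bar"]` yields `["/cmdD", "foo", "bar"]`, which matches the third case listed above.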

Environment variables

Use env to provide environment variables to your Component at runtime.

Each environment variable has a name and a value. Only strings are allowed, for both. In case you have a number or a boolean as value, make sure they are quoted as strings, e.g. "1234" or "true" respectively.

Environment variables you provide here will override variables that are defined in the Dockerfile or the container image of your Component.

Literal values

A simple environment variable could be:

env:
- name: API_VERSION
  value: v1

Configuration driven values

In most cases you don't want to hardcode the value but rather use a different one per Stage, e.g. log_level=debug on DEV, but log_level=info on LIVE environments. To achieve that, you reference a Configuration entry via config-ref, which is resolved to the actual value at Release based on the Stage you deploy to:

env:
- name: PUBLIC_FQDN_ROOT
  config-ref: public_hostname

In the example above, the value of PUBLIC_FQDN_ROOT will be resolved to a fully qualified domain name or host based on the configuration entry public_hostname, which is most probably different per Stage. For further details, see Configuration Management.

Any config-ref defined here must be resolvable at Release. This means the selected Configuration must satisfy all entries referenced here, otherwise the Release will be rejected.

If you have a config-ref that is optional, i.e. it might not be served by a Configuration entry, you can define a default value:

env:
- name: LOG_LEVEL
  config-ref: loglevel
  value: info

The LOG_LEVEL above will either be set by a Configuration, or will default to info if no Configuration entry is defined.
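The resolution rules for literal values, required references, and optional references with a default can be sketched as follows. The function `resolve_env_value`, the exception `UnresolvedConfigRef`, and the plain dictionary standing in for a Configuration are assumptions for illustration, not part of Pergola.

```python
from typing import Dict


class UnresolvedConfigRef(Exception):
    """Raised when a required config-ref has no Configuration entry."""


def resolve_env_value(entry: dict, config: Dict[str, str]) -> str:
    """Resolve one env entry against a Configuration (modelled as a dict)."""
    ref = entry.get("config-ref")
    if ref is None:
        return entry["value"]      # literal value
    if ref in config:
        return config[ref]         # served by the Configuration
    if "value" in entry:
        return entry["value"]      # optional reference: use the default
    raise UnresolvedConfigRef(f"{entry['name']}: no Configuration entry for {ref!r}")
```

With an empty Configuration, the LOG_LEVEL entry above resolves to its default `info`, while the PUBLIC_FQDN_ROOT entry raises `UnresolvedConfigRef`, mirroring a rejected Release.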

Value derived from a Component's name

If you need a reference to another Component within the same Manifest, you can use component-ref and store its actual value (which is the runtime hostname of that Component) in an environment variable:

components:
- name: web-api
  env:
  - name: DB_HOST
    component-ref: db
- name: db
  docker:
    image: postgres:16

The web-api above will have the hostname of the Component db stored in its environment variable DB_HOST. This requires that the referenced Component (db) has at least one port exposed.

Referencing other environment variables

Environment variables within the same Component may reference each other, but ordering is important: a variable that makes use of another one defined in the same Component must come later in the list. Likewise, avoid circular references.

env:
- name: REDIS_HOST_REF
  component-ref: redis
- name: REDIS_CONNECT_URI
  value: "redis://$(REDIS_HOST_REF):6379"

In the example above, the variable REDIS_HOST_REF is first resolved to the runtime hostname of the redis Component. Then the REDIS_CONNECT_URI references that hostname to construct the actual connection URI to Redis.
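Why ordering matters becomes clear when the list is expanded top to bottom. The sketch below assumes that component-ref entries have already been resolved to a hostname (here simply `redis`) and that a reference to a variable not yet defined is left untouched. Both are assumptions for illustration, and `expand_env` is a hypothetical helper.

```python
import re
from typing import Dict, List

# Matches $(NAME) references as used in the example above.
_REF = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand_env(entries: List[dict]) -> Dict[str, str]:
    """Expand $(NAME) references using only variables defined earlier in the list."""
    resolved: Dict[str, str] = {}
    for entry in entries:
        value = _REF.sub(
            # Unknown names are left as-is (assumed behaviour).
            lambda m: resolved.get(m.group(1), m.group(0)),
            entry["value"],
        )
        resolved[entry["name"]] = value
    return resolved


env = [
    {"name": "REDIS_HOST_REF", "value": "redis"},  # assumed resolved hostname
    {"name": "REDIS_CONNECT_URI", "value": "redis://$(REDIS_HOST_REF):6379"},
]
print(expand_env(env)["REDIS_CONNECT_URI"])
```

Reversing the two entries would leave `$(REDIS_HOST_REF)` unexpanded in this sketch, because the referenced variable is not yet known when the URI is processed.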

Files

Files can be mapped to desired paths within a Component, e.g. to serve environment specific configuration files, secrets or even binary files, as required at runtime.

Served from Configuration

A file can be served from a Configuration entry via the config-ref and mapped to its desired path within the Component:

files:
- path: /etc/web-api/config.json
  config-ref: web-api-config.json

For further details, see Configuration Management.

Static content

It is also possible to provide a file inline as static content:

files:
- path: /var/lib/web-api/static-content.xml
  content: |
    <?xml version="1.0" encoding="UTF-8"?>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

The content must be a string. In YAML you can also use multiline strings. In JSON you have to escape special characters and newlines accordingly.

If you need binary content, you can base64-encode it and put the encoded string here. The Component is then responsible for decoding it back into its original binary format at runtime before using it.
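A minimal sketch of the base64 round trip, using only the Python standard library. The two function names are illustrative; the first would run on your machine when preparing the Manifest, the second inside the Component at runtime.

```python
import base64


def encode_for_manifest(data: bytes) -> str:
    """Encode binary data as an ASCII string suitable for the content field."""
    return base64.b64encode(data).decode("ascii")


def decode_at_runtime(content: str) -> bytes:
    """Decode the mapped file content back into the original bytes."""
    return base64.b64decode(content)


payload = bytes([0x00, 0xFF])
encoded = encode_for_manifest(payload)
assert decode_at_runtime(encoded) == payload
```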

Storage

A Storage is a volume attached to your Component. Data on a persistent storage survives Component restarts and is also retained if you suspend the complete Stage, whereas data on an ephemeral storage is lost as soon as its Component is gone.

A storage is always exclusively owned by a Component, more precisely by an instance of a Component. If you are running multiple instances of your Component, see Scaling below, each of them will have its own volume attached. They are not shared across Components or instances of a Component.

Each storage requires:

  • a unique name
  • a path where to mount the volume within the Component
  • and a desired minimum size for the volume

With the optional attribute type you can choose between a persistent and an ephemeral storage:

  • standard: persistent disk storage for data that is retained between releases and restarts, e.g. for databases
  • temporary: ephemeral disk storage, e.g. for local caching; may be still around after a restart, but not guaranteed
  • memory: ephemeral in-memory (RAM) storage, e.g. for shared memory (like /dev/shm); definitely gone after a restart

If type is not specified, standard is the default. You cannot change the type of a storage once released.

storage:
# persistent storage for Postgres
- name: pgdata
  path: /var/lib/postgresql/data
  size: 50Gi
# more shared memory, more parallel queries
- name: dshm
  path: /dev/shm
  size: 2Gi
  type: memory

The name must be unique within the Component. Allowed characters are lowercase letters (a-z), digits (0-9), and hyphens (-). The name must start with a letter and cannot end with a hyphen.

The size is measured in bytes. You can express volume size using one of these quantity suffixes: M (mega), G (giga), T (tera), or in binary: Mi (mebi), Gi (gibi), Ti (tebi).

Examples

1G means 1 gigabyte, which equals 1000 megabytes, or 1000000 kilobytes.

1Gi means 1 gibibyte, which equals 1024 mebibytes, or 1048576 kibibytes.

See binary units for further details.

caution

The suffixes are case-sensitive, e.g. 1000M means 1000 megabytes, while 1000m would mean 1000 millibytes, which equals one byte.
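To make the suffix arithmetic concrete, here is a small sketch that converts a quantity string into bytes. It covers the suffixes named in this document plus the lowercase `m` from the caution above, and handles whole-number quantities only. `to_bytes` is a hypothetical helper, not a Pergola API.

```python
import re
from fractions import Fraction

# Multipliers per suffix; note that lowercase "m" means milli (1/1000).
SUFFIXES = {
    "m": Fraction(1, 1000),
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
}
_QUANTITY = re.compile(r"(\d+)(Ki|Mi|Gi|Ti|m|k|M|G|T)?")


def to_bytes(quantity: str) -> Fraction:
    """Convert a quantity such as '50Gi' or '1000M' into a number of bytes."""
    match = _QUANTITY.fullmatch(quantity)
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    number, suffix = match.groups()
    return Fraction(int(number)) * (SUFFIXES[suffix] if suffix else 1)
```

`to_bytes("1000M")` gives one billion bytes, whereas `to_bytes("1000m")` gives a single byte, which is the mistake the caution above warns about.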

For persistent storages (i.e. standard), you cannot shrink the size of an existing storage. If the actual size is larger than the one defined within the Manifest, e.g. due to previous deployments with a different volume size, the actual size takes precedence.

Ports

In order to provide access to your Component within your Project, you can define its ports to be exposed:

ports:
- 8080
- 9091

For further details, see the Linking Components tutorial.

Resources

For each Component, you can optionally specify the resources it needs at runtime. The resources you can specify are cpu and/or memory:

resources:
  cpu: 500m
  memory: 2Gi

CPU

You can express cpu in whole units, like "1", or in millis, like 100m. 1 CPU is equivalent to one virtual core, whereas 100m is a fraction of a CPU, equivalent to 0.1 (10% of a virtual core).

For whole units, like 2 CPUs, make sure the value is quoted as a string:

resources:
  cpu: "2"

info

If you do not specify CPU requirements, your Component will run on a best-effort basis regarding CPU usage. This means it can still use as much CPU as it needs and as is available on the infrastructure, but it might be ousted by other Components with explicit CPU requirements when the underlying compute node is under CPU pressure.
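The CPU notation described in this section maps to fractions of a virtual core as sketched below. `cpu_to_cores` is a hypothetical helper written for this example.

```python
def cpu_to_cores(cpu: str) -> float:
    """Convert a cpu value such as '500m' or '2' into a number of virtual cores."""
    if cpu.endswith("m"):
        # Millis: 1000m corresponds to one full virtual core.
        return int(cpu[:-1]) / 1000
    return float(cpu)
```

For instance, `cpu_to_cores("500m")` is half a core and `cpu_to_cores("2")` is two cores. The function takes a string on purpose, in line with the advice to quote whole units.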

Memory

memory is measured in bytes. You can express memory using one of these quantity suffixes: k (kilo), M (mega), G (giga), or in binary: Ki (kibi), Mi (mebi), Gi (gibi).

Examples

1G means 1 gigabyte, which equals 1000 megabytes, or 1000000 kilobytes.

1Gi means 1 gibibyte, which equals 1024 mebibytes, or 1048576 kibibytes.

See binary units for further details.

caution

The suffixes are case-sensitive, e.g. 1000M means 1000 megabytes, while 1000m would mean 1000 millibytes, which equals one byte.

info

If you do not specify memory requirements, your Component will run on a best-effort basis regarding memory usage. This means it can still use as much memory as it needs and as is available on the infrastructure, but it might be ousted by other Components with explicit memory requirements when the underlying compute node is under memory pressure.

Scaling

If you need more than one instance of your Component running, you can define scaling boundaries. The Component then scales dynamically within these boundaries, depending on its usage of infrastructure resources. Scaling is therefore only reliable if you have also defined proper resources.

scaling:
  min: 2
  max: 5

min is the minimum number of instances of your Component that are always running, and max is the maximum number of instances that will be running when needed. The values are integers and must not be quoted.

Ingresses

An ingress exposes your Component to the web, via http/https. Pergola automatically provides a unique URL for each ingress of your Component which you can share with your users, or use in other (external) applications to access your Component's web API.

ingresses:
- host: api
  path: /v1
  port: 8080

host defines the host prefix for the URL to be generated. This field is mandatory.

path is optional and defaults to /. Defining a path can be useful if you want to distinguish different API versions, like /v1 vs. /v2, or when you need separate access points within your app, e.g. /dashboard vs. /admin, or when your Component simply accepts incoming requests on a specific path only.

Uniqueness

The combination of host and path within a Manifest must be unique.

port defines the internal (target) port of your Component as defined under ports. This field is optional if your Component exposes only one port. If your Component exposes multiple ports, the specific one to be served via the ingress must be declared here.
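The uniqueness rule for host and path can be checked across all Components of a parsed Manifest. The sketch below assumes the Manifest has been loaded into plain dictionaries and lists; `duplicate_ingresses` is a hypothetical helper.

```python
from collections import Counter
from typing import List, Tuple


def duplicate_ingresses(components: List[dict]) -> List[Tuple[str, str]]:
    """Return (host, path) pairs that occur more than once in the Manifest."""
    counts = Counter(
        (ingress["host"], ingress.get("path", "/"))  # path defaults to /
        for component in components
        for ingress in component.get("ingresses", [])
    )
    return sorted(pair for pair, n in counts.items() if n > 1)
```

An empty result means every host and path combination is unique. Note that an ingress without a path collides with one that declares `/` explicitly, since that is the documented default.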

For further details, see the Web Component tutorial.

Scheduling

You can schedule Components, so they do not run all the time (daemon) but are executed on a schedule and are shut down once they run to completion.

Regularly

To run a Component on a regular basis, you provide the scheduled field with the desired configuration as a cron expression:

scheduled: "0 2 * * *"

note

All timings are in UTC.

For further details, see the Scheduled Components tutorial.

Once per Release

Components can also be scheduled to run for each Release:

scheduled: "@release"

This will trigger the Component to be executed exactly once, whenever there is a new Release. This is useful for example for database alterations, for notifying external workflows, etc.

For further details, see the Scheduled Components tutorial.

Automatic retries

When a scheduled Component fails, it will be automatically retried up to 3 times (default setting), so in total it will run at most 4 times before it is considered failed.

You can specify the maximum number of retries, or disable retries completely, with the max-retries attribute:

# retry up to 5 times
- name: job-regularly
  scheduled: "0 2 * * *"
  max-retries: 5

# disable retries, run exactly once only
- name: job-once
  scheduled: "@release"
  max-retries: 0
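The relationship between max-retries and the total number of runs is simple arithmetic: one initial run plus the retries. A small sketch, with `max_runs` as a hypothetical helper operating on a parsed Component:

```python
DEFAULT_MAX_RETRIES = 3  # default stated in this section


def max_runs(component: dict) -> int:
    """Upper bound on executions: the initial run plus all retries."""
    return 1 + component.get("max-retries", DEFAULT_MAX_RETRIES)
```

For the two Components above this gives at most 6 runs for job-regularly and exactly 1 run for job-once.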

Identity

An identity enables access from within a Pergola Component to cloud resources (e.g. Google BigQuery or AWS S3) via cloud provider’s native IAM entities (e.g. a GCP Service Account or an AWS IAM Role) at runtime.

identity: "data-processor"

The identity defined here is first of all a logical name. It does not grant access to cloud resources on its own. Only when you link this identity with an actual native cloud IAM entity at configuration time will your Component inherit the cloud IAM permissions at the next Release.

For further details, see Identity management.