Understanding OCI Image Spec
Cloud native apps are packaged as OCI container images. Understanding the OCI image spec plays an important role in building and optimizing container images.
Before going into the details of the OCI image spec, we'll first discuss content-addressable storage.
Content-addressable Storage
The storage we use every day is typically location-addressed: locations are used to find directories and files. For example, the path /etc/nginx/nginx.conf points to a file on the storage. Location-addressed storage is easy to understand and use. However, the content behind a location may change at any time, so reading the same location at different times may return different content.
A key advantage of container images is that they are immutable. Location-addressed storage cannot guarantee this, so content-addressable storage is used instead. For each file, we calculate a digest of its content using a hash algorithm, and this digest is used to locate the file. SHA-256 and SHA-512 are commonly used hash algorithms.
When the file content changes, its digest also changes. With the same digest, we always find the same content. We can think of the storage as a giant hash map, with digests as the keys and file contents as the values.
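The hash map idea can be sketched in a few lines of shell. This is only an illustration, not part of any OCI tool: the function names cas_put and cas_get are made up for this example, and it assumes sha256sum from GNU coreutils is available.

```shell
# Minimal content-addressable store: blobs live under <store>/sha256/<digest>.
# cas_put and cas_get are hypothetical helper names used only in this sketch.

# cas_put STORE FILE: copy FILE into STORE under its SHA-256 digest
# and print the digest in the form "sha256:<hex>".
cas_put() {
  store=$1
  file=$2
  digest=$(sha256sum "$file" | cut -d' ' -f1)
  mkdir -p "$store/sha256"
  cp "$file" "$store/sha256/$digest"
  printf 'sha256:%s\n' "$digest"
}

# cas_get STORE DIGEST: print the content stored under DIGEST ("sha256:<hex>").
cas_get() {
  store=$1
  digest=$2
  cat "$store/sha256/${digest#sha256:}"
}
```

Two files with identical content produce the same key, so they are stored only once, while any change to the content produces a different key.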
Tools
We need skopeo and jq. On Ubuntu, both tools can be installed as shown below. The tree command used later is available from the tree package.
$ sudo apt-get update
$ sudo apt-get -y install skopeo
$ sudo apt-get -y install jq
Copy Images
The Nginx image is used as the example. We use the skopeo copy command to copy the Nginx image into the OCI format. The source of skopeo copy is docker://nginx, which means the Nginx image with the tag latest on Docker Hub. The target oci:local_nginx has the prefix oci:, which means the OCI image format. local_nginx is the name of the directory where the image content is saved.
$ skopeo copy docker://nginx oci:local_nginx
Below is the output of this command.
Getting image source signatures
Copying blob eff15d958d66 done
Copying blob 1e5351450a59 done
Copying blob 2df63e6ce2be done
Copying blob 9171c7ae368c done
Copying blob 020f975acd28 done
Copying blob 266f639b35ad done
Copying config 9d446b871e done
Writing manifest to image destination
Storing signatures
After running this command, the content of the Nginx image is copied to the local file system.
Content of Image
Let's go to the local_nginx directory and use the tree command to view its content.
$ tree --du .
The directory contains two files, oci-layout and index.json. The blobs directory contains the actual content of the image. Content-addressable storage is used here: the directory name sha256 indicates the hash algorithm used, and the file names are digests.
.
├── [   56737397]  blobs
│   └── [   56733301]  sha256
│       ├── [        668]  020f975acd28936c7ff43827238aed4771d14235dc983389ec149811f7e0b7cf
│       ├── [   25347687]  1e5351450a593c3a3d7a5104f93c8b80d8dc00c827158cb3a5bf985916ea3f75
│       ├── [       1394]  266f639b35ad602ee76c3b4d4cf88285a50adf8f561d8d96d331db732fe16982
│       ├── [        602]  2df63e6ce2be0b3cefd3e659558e92b8085f032db96828343ec9cf0b7d4409fe
│       ├── [        895]  9171c7ae368c6ca24dae913fce356801f624f656360c78ca956a92c3f0fe0ec7
│       ├── [       6566]  9d446b871e5882110acf8dc0ab827425b8d25184f9426b12b2073186a0b2cdce
│       ├── [       1126]  b77780a5c0973c290799dea52ccbc975f61954907de8108d6f99e65a44fa7623
│       └── [   31370267]  eff15d958d664f0874d16aee393cc44387031ee0a68ef8542d0056c747f378e8
├── [        187]  index.json
└── [         31]  oci-layout
56741711 bytes used in 2 directories, 10 files
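Because every file name under blobs/sha256 is the digest of that file's content, the layout can be checked by re-hashing each blob. The function below is a minimal sketch of that check; verify_blobs is a hypothetical name, and sha256sum from GNU coreutils is assumed.

```shell
# verify_blobs LAYOUT_DIR: re-hash every blob and compare the result with
# the blob's file name. Prints one line per blob and returns non-zero if
# any blob does not match. (Hypothetical helper, for illustration only.)
verify_blobs() {
  layout=$1
  status=0
  for blob in "$layout"/blobs/sha256/*; do
    [ -f "$blob" ] || continue
    expected=$(basename "$blob")
    actual=$(sha256sum "$blob" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
      echo "OK       $expected"
    else
      echo "MISMATCH $expected"
      status=1
    fi
  done
  return "$status"
}
```

Inside the local_nginx directory it can be called as verify_blobs . to check all eight blobs.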
The file oci-layout marks the directory as an OCI image layout and records the version of the layout. It has the following content:
{"imageLayoutVersion": "1.0.0"}
Image Index
The file index.json is the OCI image index file; see the OCI Image Index Specification. The media type is application/vnd.oci.image.index.v1+json.
View the content of this file and format it with jq.
$ jq . index.json
The content is shown below:
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:b77780a5c0973c290799dea52ccbc975f61954907de8108d6f99e65a44fa7623",
      "size": 1126
    }
  ]
}
The manifests array contains references to one or more image manifests. Each manifest typically targets one platform, that is, one combination of OS and CPU architecture. Here we have only one manifest. For each manifest, digest is the digest of its content. The actual content can be found in the blobs directory.
Image Manifest
Image manifests conform to the OCI Image Manifest Specification. The media type is application/vnd.oci.image.manifest.v1+json.
After transforming the manifest digest into a path, we can view the content of the manifest. The digest sha256:b77780a5c0973c290799dea52ccbc975f61954907de8108d6f99e65a44fa7623 is transformed into the path blobs/sha256/b77780a5c0973c290799dea52ccbc975f61954907de8108d6f99e65a44fa7623.
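The transformation is purely textual: the part before the colon names the algorithm directory and the part after it is the file name. A small sketch using shell parameter expansion (digest_to_path is a hypothetical helper name):

```shell
# digest_to_path DIGEST: map "<algorithm>:<hex>" to "blobs/<algorithm>/<hex>".
# (Hypothetical helper, for illustration only.)
digest_to_path() {
  algorithm=${1%%:*}   # text before the first colon, e.g. sha256
  encoded=${1#*:}      # text after the first colon, i.e. the hex digest
  printf 'blobs/%s/%s\n' "$algorithm" "$encoded"
}

digest_to_path sha256:b77780a5c0973c290799dea52ccbc975f61954907de8108d6f99e65a44fa7623
# prints blobs/sha256/b77780a5c0973c290799dea52ccbc975f61954907de8108d6f99e65a44fa7623
```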
$ jq . blobs/sha256/b77780a5c0973c290799dea52ccbc975f61954907de8108d6f99e65a44fa7623
Below is the content of this manifest:
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:9d446b871e5882110acf8dc0ab827425b8d25184f9426b12b2073186a0b2cdce",
    "size": 6566
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:eff15d958d664f0874d16aee393cc44387031ee0a68ef8542d0056c747f378e8",
      "size": 31370267
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:1e5351450a593c3a3d7a5104f93c8b80d8dc00c827158cb3a5bf985916ea3f75",
      "size": 25347687
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:2df63e6ce2be0b3cefd3e659558e92b8085f032db96828343ec9cf0b7d4409fe",
      "size": 602
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:9171c7ae368c6ca24dae913fce356801f624f656360c78ca956a92c3f0fe0ec7",
      "size": 895
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:020f975acd28936c7ff43827238aed4771d14235dc983389ec149811f7e0b7cf",
      "size": 668
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:266f639b35ad602ee76c3b4d4cf88285a50adf8f561d8d96d331db732fe16982",
      "size": 1394
    }
  ]
}
In the manifest, config refers to the configuration used to run this image, and layers lists the layers of the image.
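The chain from index.json to the manifest and on to its layers can be followed in a script. The sketch below assumes jq is installed and that the first entry in manifests is the one of interest; blob_path and list_layers are hypothetical helper names.

```shell
# blob_path DIGEST: map "sha256:<hex>" to its file under blobs/.
# (Hypothetical helper, for illustration only.)
blob_path() {
  printf 'blobs/%s/%s\n' "${1%%:*}" "${1#*:}"
}

# list_layers LAYOUT_DIR: print "<size> <digest>" for every layer of the
# first manifest referenced by index.json.
list_layers() {
  layout=$1
  # Step 1: read the manifest digest from the image index.
  manifest_digest=$(jq -r '.manifests[0].digest' "$layout/index.json")
  # Step 2: open the manifest blob and print its layers.
  jq -r '.layers[] | "\(.size) \(.digest)"' "$layout/$(blob_path "$manifest_digest")"
}
```

Called as list_layers . inside local_nginx, this should print the six layers of the manifest shown above, one per line.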
Image Configurations
Image configurations conform to the OCI Image Configuration specification. The media type is application/vnd.oci.image.config.v1+json.
We can also view its content.
$ jq . blobs/sha256/9d446b871e5882110acf8dc0ab827425b8d25184f9426b12b2073186a0b2cdce
Below is the full content:
{
  "created": "2021-11-17T10:38:14.652464384Z",
  "architecture": "amd64",
  "os": "linux",
  "config": {
    "ExposedPorts": {
      "80/tcp": {}
    },
    "Env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "NGINX_VERSION=1.21.4",
      "NJS_VERSION=0.7.0",
      "PKG_RELEASE=1~bullseye"
    ],
    "Entrypoint": [
      "/docker-entrypoint.sh"
    ],
    "Cmd": [
      "nginx",
      "-g",
      "daemon off;"
    ],
    "Labels": {
      "maintainer": "NGINX Docker Maintainers <docker-maint@nginx.com>"
    },
    "StopSignal": "SIGQUIT"
  },
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:e1bbcf243d0e7387fbfe5116a485426f90d3ddeb0b1738dca4e3502b6743b325",
      "sha256:37380c5830feb5d6829188be41a4ea0654eb5c4632f03ef093ecc182acf40e8a",
      "sha256:ff4c727794302b5a0ee4dadfaac8d1233950ce9a07d76eb3b498efa70b7517e4",
      "sha256:49eeddd2150fbd14433ec1f01dbf6b23ea6cf581a50635554826ad93ce040b68",
      "sha256:1e8ad06c81b6baf629988756d90fd27c14285da4d9bf57179570febddc492087",
      "sha256:8525cde30b227bb5b03deb41bda41deb85d740b834be61a69ead59d840f07c13"
    ]
  },
  "history": [
    {
      "created": "2021-11-17T02:20:41.91188934Z",
      "created_by": "/bin/sh -c #(nop) ADD file:a2405ebb9892d98be2eb585f6121864d12b3fd983ebf15e5f0b7486e106a79c6 in / "
    },
    {
      "created": "2021-11-17T02:20:42.315994925Z",
      "created_by": "/bin/sh -c #(nop) CMD [\"bash\"]",
      "empty_layer": true
    },
    {
      "created": "2021-11-17T10:37:39.564148274Z",
      "created_by": "/bin/sh -c #(nop) LABEL maintainer=NGINX Docker Maintainers <docker-maint@nginx.com>",
      "empty_layer": true
    },
    {
      "created": "2021-11-17T10:37:39.941485145Z",
      "created_by": "/bin/sh -c #(nop) ENV NGINX_VERSION=1.21.4",
      "empty_layer": true
    },
    {
      "created": "2021-11-17T10:37:40.256097748Z",
      "created_by": "/bin/sh -c #(nop) ENV NJS_VERSION=0.7.0",
      "empty_layer": true
    },
    {
      "created": "2021-11-17T10:37:40.480423114Z",
      "created_by": "/bin/sh -c #(nop) ENV PKG_RELEASE=1~bullseye",
      "empty_layer": true
    },
    {
      "created": "2021-11-17T10:38:11.674629445Z",
      "created_by": "/bin/sh -c set -x && addgroup --system --gid 101 nginx && adduser --system --disabled-login --ingroup nginx --no-create-home --home /nonexistent --gecos \"nginx user\" --shell /bin/false --uid 101 nginx && apt-get update && apt-get install --no-install-recommends --no-install-suggests -y gnupg1 ca-certificates && NGINX_GPGKEY=573BFD6B3D8FBC641079A6ABABF5BD827BD9BF62; found=''; for server in hkp://keyserver.ubuntu.com:80 pgp.mit.edu ; do echo \"Fetching GPG key $NGINX_GPGKEY from $server\"; apt-key adv --keyserver \"$server\" --keyserver-options timeout=10 --recv-keys \"$NGINX_GPGKEY\" && found=yes && break; done; test -z \"$found\" && echo >&2 \"error: failed to fetch GPG key $NGINX_GPGKEY\" && exit 1; apt-get remove --purge --auto-remove -y gnupg1 && rm -rf /var/lib/apt/lists/* && dpkgArch=\"$(dpkg --print-architecture)\" && nginxPackages=\" nginx=${NGINX_VERSION}-${PKG_RELEASE} nginx-module-xslt=${NGINX_VERSION}-${PKG_RELEASE} nginx-module-geoip=${NGINX_VERSION}-${PKG_RELEASE} nginx-module-image-filter=${NGINX_VERSION}-${PKG_RELEASE} nginx-module-njs=${NGINX_VERSION}+${NJS_VERSION}-${PKG_RELEASE} \" && case \"$dpkgArch\" in amd64|arm64) echo \"deb https://nginx.org/packages/mainline/debian/ bullseye nginx\" >> /etc/apt/sources.list.d/nginx.list && apt-get update ;; *) echo \"deb-src https://nginx.org/packages/mainline/debian/ bullseye nginx\" >> /etc/apt/sources.list.d/nginx.list && tempDir=\"$(mktemp -d)\" && chmod 777 \"$tempDir\" && savedAptMark=\"$(apt-mark showmanual)\" && apt-get update && apt-get build-dep -y $nginxPackages && ( cd \"$tempDir\" && DEB_BUILD_OPTIONS=\"nocheck parallel=$(nproc)\" apt-get source --compile $nginxPackages ) && apt-mark showmanual | xargs apt-mark auto > /dev/null && { [ -z \"$savedAptMark\" ] || apt-mark manual $savedAptMark; } && ls -lAFh \"$tempDir\" && ( cd \"$tempDir\" && dpkg-scanpackages . > Packages ) && grep '^Package: ' \"$tempDir/Packages\" && echo \"deb [ trusted=yes ] file://$tempDir ./\" > /etc/apt/sources.list.d/temp.list && apt-get -o Acquire::GzipIndexes=false update ;; esac && apt-get install --no-install-recommends --no-install-suggests -y $nginxPackages gettext-base curl && apt-get remove --purge --auto-remove -y && rm -rf /var/lib/apt/lists/* /etc/apt/sources.list.d/nginx.list && if [ -n \"$tempDir\" ]; then apt-get purge -y --auto-remove && rm -rf \"$tempDir\" /etc/apt/sources.list.d/temp.list; fi && ln -sf /dev/stdout /var/log/nginx/access.log && ln -sf /dev/stderr /var/log/nginx/error.log && mkdir /docker-entrypoint.d"
    },
    {
      "created": "2021-11-17T10:38:12.409891183Z",
      "created_by": "/bin/sh -c #(nop) COPY file:65504f71f5855ca017fb64d502ce873a31b2e0decd75297a8fb0a287f97acf92 in / "
    },
    {
      "created": "2021-11-17T10:38:12.732754797Z",
      "created_by": "/bin/sh -c #(nop) COPY file:0b866ff3fc1ef5b03c4e6c8c513ae014f691fb05d530257dfffd07035c1b75da in /docker-entrypoint.d "
    },
    {
      "created": "2021-11-17T10:38:13.174315469Z",
      "created_by": "/bin/sh -c #(nop) COPY file:0fd5fca330dcd6a7de297435e32af634f29f7132ed0550d342cad9fd20158258 in /docker-entrypoint.d "
    },
    {
      "created": "2021-11-17T10:38:13.510082553Z",
      "created_by": "/bin/sh -c #(nop) COPY file:09a214a3e07c919af2fb2d7c749ccbc446b8c10eb217366e5a65640ee9edcc25 in /docker-entrypoint.d "
    },
    {
      "created": "2021-11-17T10:38:13.827956179Z",
      "created_by": "/bin/sh -c #(nop) ENTRYPOINT [\"/docker-entrypoint.sh\"]",
      "empty_layer": true
    },
    {
      "created": "2021-11-17T10:38:14.069756108Z",
      "created_by": "/bin/sh -c #(nop) EXPOSE 80",
      "empty_layer": true
    },
    {
      "created": "2021-11-17T10:38:14.348754639Z",
      "created_by": "/bin/sh -c #(nop) STOPSIGNAL SIGQUIT",
      "empty_layer": true
    },
    {
      "created": "2021-11-17T10:38:14.652464384Z",
      "created_by": "/bin/sh -c #(nop) CMD [\"nginx\" \"-g\" \"daemon off;\"]",
      "empty_layer": true
    }
  ]
}
There are many properties in the image configuration file.
config holds the parameters used to run the image. It may contain different kinds of parameters. For example, Env lists environment variables, ExposedPorts lists exposed ports, and Entrypoint defines the entry point of the container.
rootfs describes the layers in the image. Each entry in diff_ids is the digest of a layer's uncompressed tar archive, which is why these values differ from the digests of the gzipped layers in the manifest.
history records the events that produced the image layers. If empty_layer is true, the event did not create a layer.
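In the OCI image configuration, each diff_ids entry is the digest of a layer's uncompressed tar archive, while the manifest lists the digest of the compressed blob. The sketch below computes both for one layer file. The function names are hypothetical, and it assumes gzip and sha256sum from GNU coreutils are available.

```shell
# layer_digest FILE: digest of the compressed blob, as listed in the manifest.
# (Hypothetical helper, for illustration only.)
layer_digest() {
  printf 'sha256:%s\n' "$(sha256sum "$1" | cut -d' ' -f1)"
}

# layer_diff_id FILE: digest of the uncompressed tar archive, as listed
# in rootfs.diff_ids of the image configuration.
layer_diff_id() {
  printf 'sha256:%s\n' "$(gzip -dc "$1" | sha256sum | cut -d' ' -f1)"
}
```

Since layers and diff_ids are listed in the same order, running layer_diff_id on the first layer blob of the manifest is expected to print the first diff_ids entry of the configuration.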
Image Layers
We can also view the content of image layers. Layers are gzipped tar files. The commands below create the ~/files directory and extract the content of a layer into it.
$ mkdir -p ~/files
$ tar -xzf blobs/sha256/266f639b35ad602ee76c3b4d4cf88285a50adf8f561d8d96d331db732fe16982 -C ~/files
Below is the content of the ~/files directory. The layer contains only one file, /docker-entrypoint.d/30-tune-worker-processes.sh.
$ tree --du ~/files/
/home/ubuntu/files/
└── [       8709]  docker-entrypoint.d
    └── [       4613]  30-tune-worker-processes.sh
12805 bytes used in 1 directory, 1 file
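A layer can also be inspected without extracting it, because tar can list the entries of an archive. The sketch below goes the other way round and finds the layers that contain a given file name. find_layers_with is a hypothetical helper name; blobs that are not gzipped tar files, such as the manifest and the configuration, are simply skipped.

```shell
# find_layers_with LAYOUT_DIR NAME: print every blob whose tar listing
# contains NAME. Blobs that tar cannot read as gzipped archives are ignored.
# (Hypothetical helper, for illustration only.)
find_layers_with() {
  layout=$1
  name=$2
  for blob in "$layout"/blobs/sha256/*; do
    if tar -tzf "$blob" 2>/dev/null | grep -qF -- "$name"; then
      echo "$blob"
    fi
  done
}
```

Inside local_nginx, find_layers_with . 30-tune-worker-processes.sh should point to the layer that was extracted above.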
By comparing with the Dockerfile of the Nginx image, we can see that this layer is created by the following COPY instruction.
COPY 30-tune-worker-processes.sh /docker-entrypoint.d
Now we have established the link between Dockerfiles and image layers.
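The same link can be read from the image configuration: history entries that do not have empty_layer set to true each correspond to one layer, in order. A minimal sketch, assuming jq is installed; layer_commands is a hypothetical helper name.

```shell
# layer_commands CONFIG_FILE: print the created_by text of every history
# entry that produced a layer, in layer order.
# (Hypothetical helper, for illustration only.)
layer_commands() {
  # Entries without the empty_layer field yield null, which is not true,
  # so they are kept; entries with "empty_layer": true are dropped.
  jq -r '.history[] | select(.empty_layer != true) | .created_by' "$1"
}
```

For the configuration shown earlier, this leaves six entries (one ADD, one RUN, and four COPY instructions), which matches the six layers in the manifest.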