initial commit

John Bowdre 2022-08-05 16:29:22 -05:00
commit 2bce5b0727
16 changed files with 625 additions and 0 deletions

7
.gitignore vendored Normal file

@@ -0,0 +1,7 @@
*.vmdk
*.ovf
*.mf
*.json
id_*
authorized_keys
*.pem

128
README.md Normal file

@@ -0,0 +1,128 @@
# library-syncer
This project aims to ease some of the pains encountered when attempting to sync VM templates in a [VMware vSphere Content Library](https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vm_admin.doc/GUID-254B2CE8-20A8-43F0-90E8-3F6776C2C896.html) to a large number of geographically-remote sites under less-than-ideal networking conditions.
## Overview
The solution leverages lightweight Docker containers in server and client roles. The servers would be deployed at the primary datacenter(s), and the clients at the remote sites. The servers make a specified library folder available for the clients to periodically synchronize using `rsync` over SSH, which allows for delta syncs so that bandwidth isn't wasted transferring large VMDK files when only small portions have changed.

Once the sync has completed, each client runs a [Python script](client/build/update_library_manifests.py) to generate/update a Content Library JSON manifest which is then published over HTTP/HTTPS (courtesy of [Caddy](https://caddyserver.com/)). Traditional Content Libraries at the local site can connect to this as a [subscribed library](https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vm_admin.doc/GUID-9DE2BD8F-E499-4F1E-956B-67212DE593C6.html) to make the synced items available within vSphere.
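Once a client has completed a sync and published its manifest, you can spot-check the endpoint that a subscribed library will consume; the hostname below is just an example:
```shell
# lib.json is the library manifest generated by update_library_manifests.py
curl -s https://library.example.com/lib.json
# items.json lists the individual synced templates
curl -s https://library.example.com/items.json
```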
The rough architecture looks something like this:
```
                          |
     PRIMARY SITE         |       REMOTE SITES          +----------------------------+
                          |                             |          vSphere           |
                          |    +----------------+       |   +--------------------+   |
                          |    |                |       |   |                    |   |
                          |    | library-syncer |       |   | subscribed content |   |
                      +---+--->|     client     +------>|   |      library       |   |
                      |   |    |                |       |   |                    |   |
                      |   |    +----------------+       |   +--------------------+   |
                      |   |                             |                            |
+-----------------+   |   |    +----------------+       |   +--------------------+   |
|                 |   |   |    |                |       |   |                    |   |
| library-syncer  |   |   |    | library-syncer |       |   | subscribed content |   |
|     server      +---+---+--->|     client     +------>|   |      library       |   |
|                 |   |   |    |                |       |   |                    |   |
+-----------------+   |   |    +----------------+       |   +--------------------+   |
                      |   |                             |                            |
                      |   |    +----------------+       |   +--------------------+   |
                      |   |    |                |       |   |                    |   |
                      |   |    | library-syncer |       |   | subscribed content |   |
                      +---+--->|     client     +------>|   |      library       |   |
                          |    |                |       |   |                    |   |
                          |    +----------------+       |   +--------------------+   |
                          |                             +----------------------------+
```
## Prerequisites
### Rsync user SSH keypair
The server image includes a `syncer` user account which the clients will use to authenticate over SSH. This account is locked down and restricted with `rrsync` to only be able to run `rsync` commands. All that you need to do is generate a keypair for the account to use:
```shell
ssh-keygen -t rsa -b 4096 -N "" -f id_syncer
```
Place the generated `id_syncer` *private* key in `./data/ssh/` on the *client* Docker hosts, and the `id_syncer.pub` *public* key in `./data/ssh/` on the *server* Docker host.
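Once the server container is running (see [Usage](#usage) below), you can verify the keypair and the `rrsync` restriction from a client host with a one-off dry run; the hostname is an example:
```shell
rsync -e "ssh -l syncer -p 2222 -i ./data/ssh/id_syncer -o StrictHostKeyChecking=no" \
  -av --dry-run library-server.example.com:/ /tmp/library-test
```
This mirrors the transfer that [sync.sh](client/build/sync.sh) performs on the client's schedule.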
### TLS certificate pair (optional)
By default, the client will publish its library over HTTP. If you set the `TLS_NAME` environment variable to the server's publicly-accessible FQDN, the Caddy web server will [automatically retrieve and apply a certificate issued by Let's Encrypt](https://caddyserver.com/docs/automatic-https). For deployments on internal networks which need to use a certificate issued by an internal CA, you can set `TLS_CUSTOM_CERT=true` and place the PEM-formatted certificate *and* private key in the client's `./data/certs/` directory, named `cert.pem` and `key.pem` respectively.

You can generate the certificate signing request and key in one shot like this:
```shell
openssl req -new \
  -newkey rsa:4096 -nodes -keyout library.example.com.key \
  -out library.example.com.csr \
  -subj "/C=US/ST=Somestate/L=Somecity/O=Example.com/OU=LAB/CN=library.example.com"
```
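Once your CA returns the signed certificate, copy the pair into place on the client host under the names the container expects (the source filenames here are illustrative):
```shell
cp library.example.com.crt ./data/certs/cert.pem
cp library.example.com.key ./data/certs/key.pem
```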
## Usage
### Server
Directory structure:
```
.
├── data
│   ├── library
│   └── ssh
│       └── id_syncer.pub
└── docker-compose.yaml
```
`docker-compose.yaml`:
```yaml
version: '3'
services:
  library-syncer-server:
    container_name: library-syncer-server
    restart: unless-stopped
    image: ghcr.io/jbowdre/library-syncer-server:latest
    environment:
      - TZ=UTC
    ports:
      - "2222:22"
    volumes:
      - './data/ssh:/home/syncer/.ssh'
      - './data/library:/syncer/library'
```
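With the key and compose file in place, start the server and drop your exported templates into `./data/library/`, one folder per template (the folder name below is just an example):
```shell
docker compose up -d
mkdir -p ./data/library/my-template
cp /path/to/export/my-template/* ./data/library/my-template/
```
Each subdirectory of the library folder becomes one item in the generated manifest.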
### Client
Directory structure:
```
.
├── data
│   ├── certs
│   │   ├── cert.pem
│   │   └── key.pem
│   ├── library
│   └── ssh
│       └── id_syncer
└── docker-compose.yaml
```
`docker-compose.yaml`:
```yaml
version: '3'
services:
  library-syncer-client:
    container_name: library-syncer-client
    restart: unless-stopped
    image: ghcr.io/jbowdre/library-syncer-client:latest
    environment:
      - TZ=UTC
      - SYNC_PEER=deb01.lab.bowdre.net
      - SYNC_PORT=2222
      - SYNC_SCHEDULE=0 21 * * 5
      - SYNC_DELAY=true
      - TLS_NAME=library.lab.bowdre.net
      - TLS_CUSTOM_CERT=true
    ports:
      - "80:80/tcp"
      - "443:443/tcp"
    volumes:
      - './data/ssh:/syncer/.ssh'
      - './data/library:/syncer/library'
      - './data/certs:/etc/caddycerts'
```
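Bring the client up and watch the initial sync and manifest generation in the container logs:
```shell
docker compose up -d
docker logs -f library-syncer-client
```
Once `lib.json` shows up under `./data/library/`, point a new subscribed Content Library at `https://<TLS_NAME>/lib.json` (or `http://` if you skipped TLS).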

4
client/build/Caddyfile Normal file

@@ -0,0 +1,4 @@
:80 {
    root * /syncer/library
    file_server
}

30
client/build/Dockerfile Normal file

@@ -0,0 +1,30 @@
FROM alpine:3.16
LABEL org.opencontainers.image.source="https://github.com/jbowdre/library-syncer"
ENV CRONTAB_FILE=/var/spool/cron/crontabs/root
EXPOSE 80/tcp
RUN apk add --no-cache \
    caddy \
    openssh-client \
    python3 \
    rsync \
    tzdata
RUN mkdir /syncer
COPY ./Caddyfile /etc/caddy/Caddyfile
COPY ./update_library_manifests.py /syncer/
COPY ./sync.sh /syncer/
COPY ./entrypoint.sh /
RUN chmod +x /entrypoint.sh \
    && chmod +x /syncer/sync.sh \
    && rm -rf /tmp/*
VOLUME ["/syncer/library", "/syncer/.ssh"]
ENTRYPOINT [ "/entrypoint.sh" ]
CMD [ "sh", "-c", "crond -l 2 -f" ]

39
client/build/entrypoint.sh Normal file

@@ -0,0 +1,39 @@
#!/bin/sh
set -e
chmod 600 /syncer/.ssh/id_syncer
echo -e "\n[$(date +"%Y/%m/%d-%H:%M:%S")] Performing initial sync..."
/syncer/sync.sh now > /proc/self/fd/1 2>/proc/self/fd/2
echo -e "\n[$(date +"%Y/%m/%d-%H:%M:%S")] Creating cron job..."
if [ "$SYNC_DELAY" == "false" ]; then
echo "$SYNC_SCHEDULE /syncer/sync.sh now > /proc/self/fd/1 2>/proc/self/fd/2" >> $CRONTAB_FILE
else
echo "$SYNC_SCHEDULE /syncer/sync.sh > /proc/self/fd/1 2>/proc/self/fd/2" >> $CRONTAB_FILE
fi
chmod 0644 $CRONTAB_FILE
if [ "$TLS_NAME" != "" ]; then
if [ "$TLS_CUSTOM_CERT" == "true" ]; then
cat << EOF > /etc/caddy/Caddyfile
$TLS_FQDN {
root * /syncer/library
file_server
tls /etc/caddycerts/cert.pem /etc/caddycerts/key.pem
}
EOF
else
cat << EOF > /etc/caddy/Caddyfile
$TLS_FQDN {
root * /syncer/library
file_server
}
EOF
fi
fi
echo -e "\n[$(date +"%Y/%m/%d-%H:%M:%S")] Starting caddy..."
/usr/sbin/caddy start -config /etc/caddy/Caddyfile
echo -e "\n[$(date +"%Y/%m/%d-%H:%M:%S")] Starting cron..."
exec "$@"

22
client/build/sync.sh Normal file

@@ -0,0 +1,22 @@
#!/bin/sh
set -e
# initial sync is immediate, cron syncs have a random delay unless SYNC_DELAY is set to false
if [ "$1" != "now" ]; then
  echo -e "\n[$(date +"%Y/%m/%d-%H:%M:%S")] Waiting for random delay..."
  sleep $(( RANDOM ))
  echo -e "[$(date +"%Y/%m/%d-%H:%M:%S")] Sync starts NOW!"
else
  echo -e "\n[$(date +"%Y/%m/%d-%H:%M:%S")] Immediate sync starts NOW!"
fi
# sync
echo -e "[$(date +"%Y/%m/%d-%H:%M:%S")] Rsyncing content..."
/usr/bin/rsync -e "ssh -l syncer -p $SYNC_PORT -i /syncer/.ssh/id_syncer -o StrictHostKeyChecking=no" -av --exclude '*.json' $SYNC_PEER:/ /syncer/library
# generate content library manifest
echo -e "[$(date +"%Y/%m/%d-%H:%M:%S")] Generating content library manifest..."
/usr/bin/python3 /syncer/update_library_manifests.py -n 'Library' -p /syncer/library/
echo -e "[$(date +"%Y/%m/%d-%H:%M:%S")] Sync tasks complete!\n"

307
client/build/update_library_manifests.py Normal file

@@ -0,0 +1,307 @@
"""
Generates a library ready to be used as a VCSP endpoint for content library 2016 (vsphere 6.5) and beyond.
Adapted from https://github.com/lamw/vmware-scripts/blob/master/python/make_vcsp_2022.py
"""
__author__ = 'VMware, Inc.'
__copyright__ = 'Copyright 2019 VMware, Inc. All rights reserved.'
import argparse
import datetime
import hashlib
import logging
import json
import os
import uuid
import sys
import urllib.parse
VCSP_VERSION = 2
ISO_FORMAT = "%Y-%m-%dT%H:%MZ"
FORMAT = "json"
FILE_EXTENSION_CERT = ".cert"
LIB_FILE = ''.join(("lib", os.extsep, FORMAT))
ITEMS_FILE = ''.join(("items", os.extsep, FORMAT))
ITEM_FILE = ''.join(("item", os.extsep, FORMAT))
VCSP_TYPE_OVF = "vcsp.ovf"
VCSP_TYPE_ISO = "vcsp.iso"
VCSP_TYPE_OTHER = "vcsp.other"
logger = logging.getLogger(__name__)

def _md5_for_file(f, md5=None, block_size=2**20):
    if md5 is None:
        md5 = hashlib.md5()
    while True:
        data = f.read(block_size)
        if not data:
            break
        md5.update(data)
    return md5

def _md5_for_folder(folder):
    md5 = None
    for files in os.listdir(folder):
        if ITEM_FILE not in files:
            with open(os.path.join(folder, files), "rb") as handle:
                md5 = _md5_for_file(handle, md5)
    return md5.hexdigest()

def _make_lib(name, id=uuid.uuid4(), creation=datetime.datetime.now(), version=1):
    return {
        "vcspVersion": str(VCSP_VERSION),
        "version": str(version),
        "contentVersion": "1",
        "name": name,
        "id": "urn:uuid:%s" % id,
        "created": creation.strftime(ISO_FORMAT),
        "capabilities": {
            "transferIn": [ "httpGet" ],
            "transferOut": [ "httpGet" ],
        },
        "itemsHref": ITEMS_FILE
    }

def _make_item(directory, vcsp_type, name, files, description="", properties={},
               identifier=uuid.uuid4(), creation=datetime.datetime.now(), version=2,
               library_id="", is_vapp_template="false"):
    '''
    add type adapter metadata for OVF template
    '''
    if len(name) > 80: #max size of name is 80 chars
        extension = name.rsplit(".")[-1]
        name = name[0:80-len(extension)-1]+"."+extension
    if "urn:uuid:" not in str(identifier):
        item_id = "urn:uuid:%s" % identifier
    else:
        item_id = identifier
    type_metadata = None
    if vcsp_type == VCSP_TYPE_OVF:
        # generate sample type metadata for OVF template so that subscriber can show OVF VM type
        type_metadata_value = "{'id':'%s','version':'%s','libraryIdParent':'%s','isVappTemplate':'%s','vmTemplate':null,'vappTemplate':null,'networks':[],'storagePolicyGroups':null}" % (item_id, str(version), library_id, is_vapp_template)
        type_metadata = {
            "key": "type-metadata",
            "value": type_metadata_value,
            "type": "String",
            "domain": "SYSTEM",
            "visibility": "READONLY"
        }
    if type_metadata:
        return {
            "created": creation.strftime(ISO_FORMAT),
            "description": description,
            "version": str(version),
            "files": files,
            "id": item_id,
            "name": name,
            "metadata": [type_metadata],
            "properties": properties,
            "selfHref": "%s/%s" % (urllib.parse.quote(directory), urllib.parse.quote(ITEM_FILE)),
            "type": vcsp_type
        }
    else:
        return {
            "created": creation.strftime(ISO_FORMAT),
            "description": description,
            "version": str(version),
            "files": files,
            "id": item_id,
            "name": name,
            "properties": properties,
            "selfHref": "%s/%s" % (urllib.parse.quote(directory), urllib.parse.quote(ITEM_FILE)),
            "type": vcsp_type
        }

def _make_items(items, version=1):
    return {
        "items": items
    }

def _dir2item(path, directory, md5_enabled, lib_id):
    files_items = []
    name = os.path.split(path)[-1]
    vcsp_type = VCSP_TYPE_OTHER
    folder = ""
    folder_md5 = ""
    is_vapp = ""
    for f in os.listdir(path):
        if f == ".DS_Store" or f == ''.join((directory, os.extsep, FORMAT)):
            continue
        else:
            if f == "item.json":
                continue # skip the item.json meta data files
        p = os.path.join(path, f)
        m = hashlib.md5()
        new_folder = os.path.dirname(p)
        if new_folder != folder: # new folder (ex: template1/)
            if md5_enabled:
                folder_md5 = _md5_for_folder(new_folder)
            folder = new_folder
        if md5_enabled:
            m.update(os.path.dirname(p).encode('utf-8'))
        if ".ovf" in p:
            vcsp_type = VCSP_TYPE_OVF
            # TODO: ready ovf descriptor for type metadata
            is_vapp = "false"
        elif ".iso" in p:
            vcsp_type = VCSP_TYPE_ISO
        size = os.path.getsize(p)
        href = "%s/%s" % (directory, f)
        h = ""
        if md5_enabled:
            with open(p, "rb") as handle:
                h = _md5_for_file(handle)
        files_items.append({
            "name": f,
            "size": size,
            "etag": folder_md5,
            "hrefs": [ urllib.parse.quote(href,safe="/")]
        })
    return _make_item(name, vcsp_type, name, files_items, identifier = uuid.uuid4(), library_id=lib_id, is_vapp_template=is_vapp)

def make_vcsp(lib_name, lib_path, md5_enabled):
    lib_json_loc = os.path.join(lib_path, LIB_FILE)
    lib_items_json_loc = os.path.join(lib_path, ITEMS_FILE)
    lib_id = uuid.uuid4()
    lib_create = datetime.datetime.now()
    lib_version = 1
    updating_lib = False
    if os.path.isfile(lib_json_loc):
        logger.info("%s already exists (%s)" % (LIB_FILE, lib_json_loc))
        try:
            with open(lib_json_loc, "r") as f:
                old_lib = json.load(f)
                if "id" in old_lib:
                    lib_id = old_lib["id"].split(":")[-1]
                if "created" in old_lib:
                    lib_create = datetime.datetime.strptime(old_lib["created"], ISO_FORMAT)
                if "version" in old_lib:
                    lib_version = old_lib["version"]
                updating_lib = True
        except:
            logger.error("Failed to read %s" % lib_json_loc)
            pass
    old_items = {}
    if os.path.isfile(lib_items_json_loc):
        logger.info("%s already exists (%s)" % (ITEMS_FILE, lib_items_json_loc))
        try:
            with open(lib_items_json_loc, "r") as f:
                old_data = json.load(f)
                for item in old_data["items"]:
                    old_items[item["name"]] = item
        except:
            logger.error("Failed to read %s" % lib_items_json_loc)
            pass
    items = []
    changed = False
    for item_path in os.listdir(lib_path):
        p = os.path.join(lib_path, item_path)
        if not os.path.isdir(p):
            continue # not interesting
        item_json = _dir2item(p, item_path, md5_enabled, "urn:uuid:%s" % lib_id)
        if item_path not in old_items and updating_lib:
            changed = True
        elif item_path in old_items:
            file_changed = False
            item_json["id"] = old_items[item_path]["id"]
            item_json["created"] = old_items[item_path]["created"]
            item_json["version"] = old_items[item_path]["version"]
            file_names = set([i["name"] for i in item_json["files"]])
            old_file_names = set([i["name"] for i in old_items[item_path]["files"]])
            if file_names != old_file_names:
                # files added or removed
                changed = True
                file_changed = True
            for f in item_json["files"]:
                if file_changed:
                    break
                for old_f in old_items[item_path]["files"]:
                    if f["name"] == old_f["name"] and f["etag"] != old_f["etag"]:
                        changed = True
                        file_changed = True
                        break
            if file_changed:
                item_version = int(item_json["version"])
                item_json["version"] = str(item_version + 1)
            del old_items[item_path]
        json_item_file = ''.join((p, os.sep, ITEM_FILE))
        with open(json_item_file, "w") as f:
            json.dump(item_json, f, indent=2)
        items.append(item_json)
    if updating_lib and len(old_items) != 0:
        changed = True # items were removed
    if updating_lib and not changed:
        logger.info("Nothing to update, quitting")
        return
    if changed:
        lib_version = int(lib_version)
        lib_version += 1
    logger.info("Saving results to %s and %s" % (lib_json_loc, lib_items_json_loc))
    with open(lib_json_loc, "w") as f:
        json.dump(_make_lib(lib_name, lib_id, lib_create, lib_version), f, indent=2)
    with open(lib_items_json_loc, "w") as f:
        json.dump(_make_items(items, lib_version), f, indent=2)

def _get_item(json_object, name, value):
    return [obj for obj in json_object if obj[name]==value]

def parse_options():
    """
    Parse command line options
    """
    parser = argparse.ArgumentParser(usage=usage())
    # Run options
    parser.add_argument('-n', '--name', dest='name',
                        help="library name")
    parser.add_argument('-p', '--path', dest='path',
                        help="library path on storage")
    parser.add_argument('--etag', dest='etag',
                        default='true', help="generate etag")
    parser.add_argument('--skip-cert', dest='skip_cert',
                        default='true', help="skip OVF cert")
    args = parser.parse_args()
    if args.name is None or args.path is None:
        parser.print_help()
        sys.exit(1)
    return args

def usage():
    '''
    The usage message for the argument parser.
    '''
    return """Usage: python update_library_manifests.py -n <library-name> -p <library-storage-path> --etag <true or false, default true>
           --skip-cert <true or false, default true>
"""

def main():
    args = parse_options()
    lib_name = args.name
    lib_path = args.path
    md5_enabled = args.etag == 'true' or args.etag == 'True'
    skip_cert = args.skip_cert == 'true' or args.skip_cert == 'True'
    make_vcsp(lib_name, lib_path, md5_enabled)

if __name__ == "__main__":
    main()

0
client/data/certs/.gitkeep Normal file

0
client/data/library/.gitkeep Normal file

0
client/data/ssh/.gitkeep Normal file

21
client/docker-compose.yaml Normal file

@@ -0,0 +1,21 @@
version: '3'
services:
  library-syncer-client:
    container_name: library-syncer-client
    restart: unless-stopped
    image: ghcr.io/jbowdre/library-syncer-client:latest
    environment:
      - TZ=UTC
      - SYNC_PEER=deb01.lab.bowdre.net
      - SYNC_PORT=2222
      - SYNC_SCHEDULE=0 21 * * 5
      - SYNC_DELAY=true
      - TLS_NAME=library.lab.bowdre.net
      - TLS_CUSTOM_CERT=true
    ports:
      - "80:80/tcp"
      - "443:443/tcp"
    volumes:
      - './data/ssh:/syncer/.ssh'
      - './data/library:/syncer/library'
      - './data/certs:/etc/caddycerts'

29
server/build/Dockerfile Normal file

@@ -0,0 +1,29 @@
FROM alpine:3.16
LABEL org.opencontainers.image.source="https://github.com/jbowdre/library-syncer"
ENV SYNC_CMD='command="/usr/bin/rrsync -ro /syncer/library/",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding'
RUN apk add --no-cache \
    openssh-server \
    rsync \
    rrsync \
    tzdata
RUN mkdir /syncer \
    && adduser -h /home/syncer -D -s /bin/sh syncer
USER syncer
RUN mkdir -m 700 /home/syncer/.ssh
USER root
COPY ./entrypoint.sh /
RUN chmod +x /entrypoint.sh \
    && rm -rf /tmp/* \
    && rm -rf /etc/ssh/ssh_host_rsa_key /etc/ssh/ssh_host_dsa_key
EXPOSE 22/tcp
VOLUME [ "/syncer/library", "/home/syncer/.ssh" ]
ENTRYPOINT [ "/entrypoint.sh" ]
CMD [ "/usr/sbin/sshd", "-D" ]

25
server/build/entrypoint.sh Normal file

@@ -0,0 +1,25 @@
#!/bin/sh
set -e
# set ssh config permissions
echo "$SYNC_CMD $(cat /home/syncer/.ssh/id_syncer.pub)" > /home/syncer/.ssh/authorized_keys
chown syncer:syncer /home/syncer/.ssh/authorized_keys && chmod 600 /home/syncer/.ssh/authorized_keys
if [ "$(getent shadow syncer | awk 'BEGIN { FS = ":" } ; { print $2 }')" = '!' ]; then
  passwd -u syncer
fi
if [ ! -f "/etc/ssh/ssh_host_rsa_key" ]; then
  ssh-keygen -f /etc/ssh/ssh_host_rsa_key -N '' -t rsa
fi
if [ ! -f "/etc/ssh/ssh_host_dsa_key" ]; then
  ssh-keygen -f /etc/ssh/ssh_host_dsa_key -N '' -t dsa
fi
if [ ! -d "/var/run/sshd" ]; then
  mkdir -p /var/run/sshd
fi
sed -i "s/^#PasswordAuthentication yes/PasswordAuthentication no/g" /etc/ssh/sshd_config
echo -e "\n[$(date +"%Y/%m/%d-%H:%M:%S")] Starting sshd..."
exec "$@"

0
server/data/library/.gitkeep Normal file

0
server/data/ssh/.gitkeep Normal file

13
server/docker-compose.yaml Normal file

@@ -0,0 +1,13 @@
version: '3'
services:
  library-syncer-server:
    container_name: library-syncer-server
    restart: unless-stopped
    image: ghcr.io/jbowdre/library-syncer-server:latest
    environment:
      - TZ=UTC
    ports:
      - "2222:22"
    volumes:
      - './data/ssh:/home/syncer/.ssh'
      - './data/library:/syncer/library'