I have been following this blog from Oracle on how to add Longhorn to an OKE cluster so I can make better use of the free-tier block storage. I ran into a number of issues along the way, so I have documented the fixes here.
Dynamic Group, what?
There is a sneaky line before the script that says "You need to create a dynamic group and provide this group manage access to block storage, this allows us to use instance principal authentication in our script." with no explanation of what a dynamic group is or what needs to go into it. I am not sure that I have used the most secure settings here, so I would welcome any tips on how to do it properly.
Steps:
- Get the OCID for the compartment you are adding your OKE cluster to. In the search box type "compartment", find it in the table, hover over the value in the OCID column, and use the copy button in the tooltip.
- In the menu on the right, click Domains, then the one in the table footer with (Current Domain) next to it. Mine is Default.
- Again in the new right menu, click Dynamic Groups.
- Click Create dynamic group.
- Name it. I called mine Block.
- In the rule section there is a link to the rule builder, click it to open the modal.
- Change the Match instance with dropdown to Compartment OCID, then paste the OCID from step 1 and click Add rule.
- Click Save.
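For reference, the rule builder should spit out a matching rule along these lines (the OCID is a placeholder for the one you copied in step 1):

ALL {instance.compartment.id = 'ocid1.compartment.oc1..<your-compartment-ocid>'}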
We have the group; now we need to grant it permission to manage block storage.
Steps:
- Click on Policies in the right menu.
- Click Create Policy.
- Name it. I called mine block again.
- Add a Description.
- Select your compartment (the one you copied the OCID for earlier).
- Select Storage Management in the Policy use case dropdown.
- Select Let volume admins manage block volumes, backups, and volume groups in the Common policy templates dropdown.
- Set the Identity domain to the one used earlier.
- Toggle the Dynamic groups radio button, then pick the group we made above.
- Set the location to the same compartment selected earlier.
- Save.
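Behind the scenes, that template with the dynamic group option selected should produce a policy statement roughly like this (Block is the dynamic group from above; substitute your own compartment name):

Allow dynamic-group Block to manage volume-family in compartment <your-compartment>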
Now you should have the required permissions for the script.
The Script needs formatting
The shell script they provided has not been formatted correctly for the embedded Python to run. Everything after the # Call instance metadata uri to get current instance details line needs to be unindented:
#!/bin/bash
curl --fail -H "Authorization: Bearer Oracle" -L0 http://169.254.169.254/opc/v2/instance/metadata/oke_init_script | base64 --decode >/var/run/oke-init.sh
bash /var/run/oke-init.sh
echo "installing python3-pip, oci sdk"
sudo yum install python3 -y
sudo yum install python3-pip -y
pip3 install oci
pip3 install requests
cat << EOF > pyscript.py
#!/usr/bin/python
import oci
import requests

size_in_gbs = 200
vpus_per_gb = 10
mode = 'PARA'
device_path = "/dev/oracleoci/oraclevdb"

# Instance principal auth - this is what the dynamic group and policy above enable
signer = oci.auth.signers.InstancePrincipalsSecurityTokenSigner()
compute_client = oci.core.ComputeClient({}, signer=signer)
block_storage_client = oci.core.BlockstorageClient({}, signer=signer)

def get_current_instance_details():
    # The instance metadata service returns details about the node we are running on
    r = requests.get(url='http://169.254.169.254/opc/v1/instance')
    return r.json()

def create_volume(block_storage, compartment_id, availability_domain, display_name: str):
    print("--- creating block volume ---")
    result = block_storage.create_volume(
        oci.core.models.CreateVolumeDetails(
            compartment_id=compartment_id,
            availability_domain=availability_domain,
            display_name=display_name,
            size_in_gbs=size_in_gbs,
            vpus_per_gb=vpus_per_gb
        )
    )
    volume = oci.wait_until(
        block_storage,
        block_storage.get_volume(result.data.id),
        'lifecycle_state',
        'AVAILABLE'
    ).data
    print('--- Created Volume ocid: {} ---'.format(result.data.id))
    return volume

def attach_volume(instance_id, volume_id, device_path):
    volume_attachment_response = ""
    if mode == 'ISCSI':
        print("--- Attaching block volume {} to instance {}---".format(volume_id, instance_id))
        volume_attachment_response = compute_client.attach_volume(
            oci.core.models.AttachIScsiVolumeDetails(
                display_name='IscsiVolAttachment',
                instance_id=instance_id,
                volume_id=volume_id,
                device=device_path
            )
        )
    elif mode == 'PARA':
        volume_attachment_response = compute_client.attach_volume(
            oci.core.models.AttachParavirtualizedVolumeDetails(
                display_name='ParavirtualizedVolAttachment',
                instance_id=instance_id,
                volume_id=volume_id,
                device=device_path
            )
        )
    oci.wait_until(
        compute_client,
        compute_client.get_volume_attachment(volume_attachment_response.data.id),
        'lifecycle_state',
        'ATTACHED'
    )
    print("--- Attaching complete block volume {} to instance {}---".format(volume_id, instance_id))
    print(volume_attachment_response.data)

# Call instance metadata uri to get current instance details
instanceDetails = get_current_instance_details()
print(instanceDetails)
volume = create_volume(block_storage=block_storage_client, compartment_id=instanceDetails['compartmentId'], availability_domain=instanceDetails['availabilityDomain'], display_name=instanceDetails['displayName'])
attach_volume(instance_id=instanceDetails['id'], volume_id=volume.id, device_path=device_path)
EOF
echo "running python script"
chmod 755 pyscript.py
./pyscript.py
echo "creating file system on volume"
sudo /sbin/mkfs.ext4 /dev/oracleoci/oraclevdb
echo "mounting volume"
sudo mkdir /mnt/volume
sudo mount /dev/oracleoci/oraclevdb /mnt/volume
echo "adding entry to fstab"
echo "/dev/oracleoci/oraclevdb /mnt/volume ext4 defaults,_netdev,nofail 0 2" | sudo tee -a /etc/fstab
Formatting is hard
The patch file for adding the Longhorn annotations and labels is not valid YAML and needs formatting as below:
metadata:
  labels:
    node.longhorn.io/create-default-disk: "config"
  annotations:
    node.longhorn.io/default-disks-config: '[
      {
        "path":"/var/lib/longhorn",
        "allowScheduling":true
      },
      {
        "name":"fast-ssd-disk",
        "path":"/mnt/volume",
        "allowScheduling":true,
        "tags":[ "ssd", "fast" ]
      }
    ]'
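Rather than editing the node object by hand, the patch can also be applied with kubectl. A minimal sketch, assuming the YAML above is saved as longhorn-node-patch.yaml (the node name is a placeholder):

kubectl patch node <your-node-name> --type merge --patch-file longhorn-node-patch.yaml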
Take the Helm… or not
The command given does not include the command to add the Helm repo first:
helm repo add longhorn https://charts.longhorn.io
and the version pinned is out of date for the 1.30.1 cluster provisioned. I omitted the version and took the latest at the time of writing.
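For reference, a version-less install along the lines of the Longhorn docs:

helm repo update
helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace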
Let's get our UI on
There are a number of issues with the UI part.
Firstly, the YAML is badly formatted again…
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: longhorn-ingress
  namespace: longhorn-system
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: longhorn-frontend
                port:
                  number: 80
Then there is the issue that the NGINX Ingress Controller is not on the cluster, so it needs to be added via its Helm chart:
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace
Then the ingress class annotation has been deprecated, so the manifest needs changing to:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: longhorn-ingress
  namespace: longhorn-system
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: longhorn-frontend
                port:
                  number: 80
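Once that is applied, the UI should be reachable at the external IP the OCI load balancer assigns to the controller. Assuming the chart's default service name:

kubectl get svc -n ingress-nginx ingress-nginx-controller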
Other Issues
Where to live
I had an issue with the availability domain: my nodes were being created in kEHF:UK-LONDON-1-AD-1
and I was not able to add any storage there. I moved my nodes to kEHF:UK-LONDON-1-AD-3,
where I was able to create block storage (I tested it by manually creating volumes in the Block Storage screen).
Too big to fit
The script has the size set at 200 GB, which I think is the maximum you can have in each zone on the free tier. I tried reducing it down to 50 GB, but the volumes still failed to create. In the end I manually created the volumes and attached them to the nodes. If you create and attach the mounts manually, you will need to run the following on each node:
sudo /sbin/mkfs.ext4 /dev/oracleoci/oraclevdb
sudo mkdir /mnt/volume
sudo mount /dev/oracleoci/oraclevdb /mnt/volume
echo "/dev/oracleoci/oraclevdb /mnt/volume ext4 defaults,_netdev,nofail 0 2" | sudo tee -a /etc/fstab