Dosing automations start on the leader, but not on the workers

Hi,

I have added two custom automation plugins. I followed the instructions in THIS post:

I added the .py files here:
pioreactor@leader.local:/home/pioreactor/.pioreactor/plugins

The yaml files are hosted here:
pioreactor@leader.local:/var/www/pioreactorui/contrib/automations/dosing/

I have no problem running them on the leader through the UI (Pioreactor → Manage → Dosing automation → Morbidostat LT). However, although they appear in the dosing automation options of the two workers in the cluster (worker1 and worker2), when I click “Start” a message appears (“Starting dosing automation”) but nothing starts. The same thing happens when I run one of the default dosing automations, such as “Chemostat”.

Is it necessary to copy the .py and .yaml files to each worker, or is there another way to do it?

I tried to copy the files from my laptop to worker1 and got the following message:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ED25519 key sent by the remote host is
SHA256:gyi8UVwfxekl1+qfPJwqsDsEjrWY25Elga1IcMyzVz4.
Please contact your system administrator.
Add correct host key in /home/laptop/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /home/laptop/.ssh/known_hosts:10
  remove with:
  ssh-keygen -f "/home/laptop/.ssh/known_hosts" -R "worker1.local"
Host key for worker1.local has changed and you have requested strict checking.
Host key verification failed.

Cheers!

Well, I managed to copy the .py files to the workers from the leader instead of from my laptop. However, this doesn’t fix the problem. When I check the plugin list on each Pioreactor, I get the following output:

pioreactor@leader:~ $ pio plugins list
/home/pioreactor/.pioreactor/plugins/__init__.py encountered plugin load error: attempted relative import with no known parent package
base==Unknown
chemostat==Unknown
fed_batch==Unknown
morbidostat_classic==Unknown
morbidostat_lt==Unknown
pid_morbidostat==Unknown
silent==Unknown
turbidostat==Unknown
pioreactor@leader:~ $ ssh pioreactor@worker1.local
Last login: Wed Jul 17 15:32:21 2024 from 192.168.1.220
pioreactor@worker1:~ $ pio plugins list
/home/pioreactor/.pioreactor/plugins/__init__.py encountered plugin load error: attempted relative import with no known parent package
/home/pioreactor/.pioreactor/plugins/base.py encountered plugin load error: No module named 'pioreactor.background_jobs.dosing_control'
chemostat==Unknown
fed_batch==Unknown
morbidostat_classic==Unknown
morbidostat_lt==Unknown
pid_morbidostat==Unknown
silent==Unknown
turbidostat==Unknown
pioreactor@worker1:~ $ ssh pioreactor@worker2.local
Last login: Wed Jul 17 15:34:41 2024 from 192.168.1.251
pioreactor@worker2:~ $ pio plugins list
/home/pioreactor/.pioreactor/plugins/__init__.py encountered plugin load error: attempted relative import with no known parent package
/home/pioreactor/.pioreactor/plugins/base.py encountered plugin load error: No module named 'pioreactor.background_jobs.dosing_control'
chemostat==Unknown
fed_batch==Unknown
morbidostat_classic==Unknown
morbidostat_lt==Unknown
pid_morbidostat==Unknown
silent==Unknown
turbidostat==Unknown

Hi @Pedre91,

Happy to help,

Is it necessary to copy the .py and .yaml files to each worker, or is there another way to do it?

You need to copy the .py files over to the workers, yes, but not the .yaml files. There’s an easy utility to copy a file from the leader to all the workers in one command:

pios cp /home/pioreactor/.pioreactor/plugins/my_plugin.py

will send the leader’s file /home/pioreactor/.pioreactor/plugins/my_plugin.py to all the workers.
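For example, if your two plugin files are base.py and morbidostat_lt.py (I’m guessing the names from the automation above), that would be:

pios cp /home/pioreactor/.pioreactor/plugins/base.py
pios cp /home/pioreactor/.pioreactor/plugins/morbidostat_lt.py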


On to the next problem,

/home/pioreactor/.pioreactor/plugins/base.py encountered plugin load error: No module named 'pioreactor.background_jobs.dosing_control'

We changed the API for dosing automations a few months back (in the 24.7.18 release), and, looking back, I think we did a poor job of helping users migrate their existing automation plugins.

Can you share your base.py file here (or at info@pioreactor.com)? I can help migrate it.

(btw this error is technically harmless, and isn’t the source of the problem below)
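
As a rough sketch of the change (assuming your base.py was built on the old controller API): the import that now fails is the dosing_control one, and from 24.7.18 onward a custom dosing automation subclasses the dosing automation base class directly, e.g.:

# pre-24.7.18 (this module no longer exists):
# from pioreactor.background_jobs.dosing_control import DosingController

# 24.7.18 and later:
from pioreactor.automations.dosing.base import DosingAutomationJobContrib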


So the current problem is that when you click Start in the UI to run morbidostat_lt, it only runs on the leader, and not on the workers? If so, can you try running it again and show me the output of pio logs on the worker?

Thank you Cameron

This is the content from my base.py file:

# -*- coding: utf-8 -*-
from __future__ import annotations

import time
from contextlib import suppress
from datetime import datetime
from functools import partial
from threading import Thread
from typing import Any
from typing import cast
from typing import Optional

from msgspec.json import decode
from msgspec.json import encode

from pioreactor import exc
from pioreactor import structs
from pioreactor import types as pt
from pioreactor.actions.pump import add_alt_media
from pioreactor.actions.pump import add_media
from pioreactor.actions.pump import remove_waste
from pioreactor.automations import events
from pioreactor.automations.base import AutomationJob
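# NOTE: this next import is the one the workers report as missing ('pioreactor.background_jobs.dosing_control'):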
from pioreactor.background_jobs.dosing_control import DosingController
from pioreactor.config import config
from pioreactor.pubsub import QOS
from pioreactor.utils import is_pio_job_running
from pioreactor.utils import local_persistant_storage
from pioreactor.utils import SummableDict
from pioreactor.utils.timing import current_utc_datetime
from pioreactor.utils.timing import RepeatedTimer

In addition, I realized that temperature automations also only work on the leader.

EDIT: This is the log from worker1 after rebooting and starting a dosing automation (Chemostat):

2024-09-24T16:12:54+0200 [monitor] NOTICE worker1 is online and ready.
2024-09-24T16:12:54+0200 [monitor] INFO Ready.
2024-09-24T16:12:54+0200 [monitor] DEBUG monitor is blocking until disconnected.
2024-09-24T16:13:05+0200 [monitor] DEBUG Running `JOB_SOURCE=user EXPERIMENT=PRUEBA2 nohup pio run dosing_control --automation-name chemostat --skip-first-run 0 --duration 20 --volume 0.5 >/dev/null 2>&1 &` from monitor job.

Hm, I think there are some version mismatches. I say this because dosing_control isn’t a “thing” anymore after version 24.7.18.
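
(For context, the worker log above shows monitor launching pio run dosing_control .... From 24.7.18 onward the controller job is gone and, if I recall the rename correctly, the automation is launched directly, along the lines of:)

pio run dosing_automation --automation-name chemostat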

A couple of checks:

  1. What software version is your leader vs your workers (you should be able to find this information on the Inventory page)?
  2. Can you share the .yaml file you added to /var/www/pioreactorui/contrib/automations/dosing/?

Mmmmm, that could be the problem:

leader: 24.6.10
workers: 24.7.18

morbidostat_lt.yaml

---
display_name: Morbidostat LT
automation_name: morbidostat_lt
description: >

  Modification of morbidostat classic control including a lower threshold.

fields:
  - key: target_normalized_od
    default: 0.2
    unit: AU
    label: Target nOD
    type: numeric
  - key: trigger_od_value
    default: 0.1
    unit: AU
    label: Upper nOD threshold
    type: numeric
  - key: volume
    default: 1.0
    unit: mL
    label: Volume
    type: numeric
  - key: duration
    default: 0.25
    unit: min
    label: Time between check
    disabled: True
    type: numeric

If helpful: morbidostat_lt.py

# -*- coding: utf-8 -*-
from __future__ import annotations

from pioreactor.automations import events
from pioreactor.automations.dosing.base import DosingAutomationJobContrib
from pioreactor.exc import CalibrationError
from pioreactor.utils import local_persistant_storage


class MorbidostatLT(DosingAutomationJobContrib):
    """Adding a lower threshold to morbidostat classic control """

    automation_name = "morbidostat_lt"
    published_settings = {
        "volume": {"datatype": "float", "settable": True, "unit": "mL"},
        "target_normalized_od": {"datatype": "float", "settable": True, "unit": "AU"},
        "duration": {"datatype": "float", "settable": True, "unit": "min"},
        "trigger_od_value": {"datatype": "float", "settable": True, "unit": "AU"},  # New parameter
    }

    def __init__(self, target_normalized_od: float | str, volume: float | str, trigger_od_value: float | str, **kwargs):
        super(MorbidostatLT, self).__init__(**kwargs)

        with local_persistant_storage("current_pump_calibration") as cache:
            if "media" not in cache:
                raise CalibrationError("Media pump calibration must be performed first.")
            elif "waste" not in cache:
                raise CalibrationError("Waste pump calibration must be performed first.")
            elif "alt_media" not in cache:
                raise CalibrationError("Alt media pump calibration must be performed first.")

        self.target_normalized_od = float(target_normalized_od)
        self.volume = float(volume)
        self.trigger_od_value = float(trigger_od_value)  # Store the trigger OD value
        self.trigger_reached = False  # Boolean to check if the OD has passed the trigger value

    def execute(self) -> events.AutomationEvent:
        # Check if we have a previous OD reading
        if self.previous_normalized_od is None:
            return events.NoEvent("Skip first event to wait for OD readings.")

        # Check if we have surpassed the trigger OD value
        if not self.trigger_reached:
            if self.latest_normalized_od >= self.trigger_od_value:
                self.trigger_reached = True  # Mark that the trigger has been reached
            else:
                return events.NoEvent(f"Waiting for OD to reach the trigger value {self.trigger_od_value:.2f} AU.")

        # Proceed with normal operation only if the trigger has been reached
        if self.trigger_reached:
            if (
                self.latest_normalized_od >= self.target_normalized_od
                and self.latest_normalized_od >= self.previous_normalized_od
            ):
                # If we are above the threshold, and growth rate is greater than dilution rate
                self.execute_io_action(alt_media_ml=self.volume, waste_ml=self.volume)
                return events.AddAltMediaEvent(
                    f"latest OD, {self.latest_normalized_od:.2f} >= Target OD, {self.target_normalized_od:.2f} and Latest OD, {self.latest_normalized_od:.2f} >= Previous OD, {self.previous_normalized_od:.2f}"
                )
            else:
                self.execute_io_action(media_ml=self.volume, waste_ml=self.volume)
                return events.DilutionEvent(
                    f"latest OD, {self.latest_normalized_od:.2f} < Target OD, {self.target_normalized_od:.2f} or Latest OD, {self.latest_normalized_od:.2f} < Previous OD, {self.previous_normalized_od:.2f}"
                )

Try updating your cluster to at least the next version, 24.7.18. Here are our docs on that process. The workers will “update” to 24.7.18 too, but since they’re already on it this is harmless and won’t change them.

Hi again,

I’m trying to update to 24.7.18 via .zip file.

I downloaded the file from , renamed it to release_24.7.18.zip, and uploaded it via the UI.

and I got the following log:

2024-09-24T17:00:19+0200 [update_app] DEBUG unzip /tmp/release_24.7.18/wheels_24.7.18.zip -d /tmp/release_24.7.18/wheels
2024-09-24T17:00:19+0200 [update_app] DEBUG unzip:  cannot find or open /tmp/release_24.7.18/wheels_24.7.18.zip, /tmp/release_24.7.18/wheels_24.7.18.zip.zip or /tmp/release_24.7.18/wheels_24.7.18.zip.ZIP.

2024-09-24T17:00:19+0200 [update_app] ERROR Update failed. See logs.

Sorry for so many questions! :confused:

Sorry for so many questions! :confused:

It’s okay! It helps us understand what documentation / processes we are missing.

I downloaded the file from , renamed it to release_24.7.18.zip, and uploaded it via the UI.

Hm, can you point me to which file you downloaded? The log shows unzip couldn’t find wheels_24.7.18.zip inside the uploaded archive, which suggests it wasn’t the full release bundle. You should download this file: release_24.7.18.zip

Hi Cameron:

I didn’t upload the right file, sorry.

The automation issues were fixed after updating to 24.7.18.

I then updated sequentially to 24.8.22 and 24.9.19.

Everything works fine now :slight_smile:

Thanks a lot!
