Hello,
I am running 10 Pioreactors (40 mL v1.5 hardware) across two designated leader-only units (no Pioreactor hardware attached).
I recently updated my system to version 26.3.0. The two leaders each run on a Raspberry Pi 5, and the 10 workers (5 per leader) run on Raspberry Pi Zero 2 Ws. Most of my workers were updated from version 26.2.23; one of the leaders was set up recently, so it was already on 26.3.0. I updated all systems over the internet through the UI.
After updating, my workers keep disconnecting from their respective leaders (on both leaders). By disconnecting I mean a combination of things: jobs get "lost" and need to be manually restarted through the UI, and occasionally a worker goes offline entirely and needs to be rebooted through the UI or manually power-cycled.
I am running stirring and heating on the 5 workers under leader01, and stirring, heating, and LED control on the other 5 workers under leader02.
When I first updated the system I was getting this error:
```
Exception on /api/bioreactor/descriptors [GET]
Traceback (most recent call last):
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/pioreactor/whoami.py", line 225, in _get_pioreactor_model_name
    result.raise_for_status()
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/pioreactor/mureq.py", line 250, in raise_for_status
    raise HTTPErrorStatus(self.status_code)
pioreactor.mureq.HTTPErrorStatus: HTTP response returned error code 404

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/flask/app.py", line 1511, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/flask/app.py", line 919, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/flask/app.py", line 917, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/flask/app.py", line 902, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/pioreactor/web/api.py", line 2318, in get_bioreactor_variable_descriptors
    return attach_cache_control(jsonify(to_builtins(get_bioreactor_descriptors())), max_age=0)
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/pioreactor/bioreactor.py", line 76, in get_bioreactor_descriptors
    default=get_default_bioreactor_value(metadata.key),
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/pioreactor/bioreactor.py", line 60, in get_default_bioreactor_value
    return validate_bioreactor_value(variable_name, resolved_default)
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/pioreactor/bioreactor.py", line 92, in validate_bioreactor_value
    maximum = get_pioreactor_model().reactor_max_fill_volume_ml
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/pioreactor/whoami.py", line 176, in get_pioreactor_model
    name = _get_pioreactor_model_name(target_unit_name)
  File "/opt/pioreactor/venv/lib/python3.13/site-packages/pioreactor/whoami.py", line 230, in _get_pioreactor_model_name
    raise NoWorkerFoundError(f"Worker {unit_name} is not found.")
pioreactor.exc.NoWorkerFoundError: Worker leader01 is not found.
```
I eventually worked around this by SSHing into the leader and adding it to the cluster as a worker, using commands suggested by Gemini; it now shows up in the UI. I did not assign it to an experiment, and I set its model to 40 mL v1.5 as a placeholder. The errors then stopped, and this seemed to temporarily resolve the issue.
I was then able to successfully run heating, stirring, and LEDs on leader02's 5 workers for 6 hours. But when I started jobs on leader01 for the other 5 Pioreactors, all the jobs were lost. I suspected this was caused by both leaders trying to access the same workers: in leader02's config file, all 10 Pioreactors were listed, while leader01 correctly listed only its 5, so both leaders were addressing 5 of the same workers. I fixed the config files and switched everything to static IP addresses just to rule out addressing issues, then started another test with 5 workers per leader. All jobs were lost almost immediately again.
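For reference, this is roughly what the corrected per-worker pointing looks like in my setup. The IPs are placeholders, and the section/key names are how I remember them from my config.ini, so double-check them against your own file:

```ini
; On each of leader01's 5 workers (config.ini) -- placeholder static IPs
[cluster.topology]
leader_hostname=leader01
leader_address=192.168.1.10

; On each of leader02's 5 workers, the same section instead reads:
;   leader_hostname=leader02
;   leader_address=192.168.1.20
```

After this change, each worker should only ever talk to its own leader, and neither leader's inventory should reference the other cluster's workers.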
I'm not sure what else to try; any help is greatly appreciated!