Issues connecting to worker after trying to switch wifi

I was trying to switch the wifi of my Pioreactor to a different network without reimaging it, so I used this command over SSH:

sudo nmcli device wifi connect "<ssid name>" password "<ssid password>" ifname wlan0

I did this on both the leader and the worker. What may be causing the issue is that I switched wifi before it gave me a confirmation that it had successfully changed networks, since the leader did not connect to the new wifi. If I were to reimage the worker to reconnect it to the leader, would that affect any of the data I have?

That command looks correct. Note that there won’t be a confirmation: the Pi disconnects from network A and connects to network B, and since your PC is still on network A, the SSH session simply drops. From your POV, it usually just looks like the console has become unresponsive.
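
Once your PC has also joined network B, you can confirm which network the Pi actually landed on with the standard NetworkManager commands, e.g.:

nmcli device status
nmcli -f ACTIVE,SSID device wifi

The second one lists visible SSIDs and marks which one is active.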

If I were to reimage the worker to reconnect it to the leader would that affect any of the data I have?

No, that’s fine. From the leader, just remove that worker and re-add it using the usual add-pioreactor flow. None of the worker’s data on the leader is deleted.
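
If you prefer the command line over the UI, the same can be done from the leader; on a recent leader version the subcommands are along these lines (exact names may differ by version, so treat this as a sketch):

pio workers remove <worker hostname>
pio workers add <worker hostname>

The UI’s add-pioreactor flow does the same thing under the hood.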


Could you instead connect to network B, SSH into the worker (which I presume did switch networks), and reconnect it to network A?

Okay, I was able to connect both back to the original network, but when I tried adding the worker back it gave me this error:

2024-06-14T18:02:03-0400 INFO [add_pioreactor] Adding new pioreactor pr3 to cluster.
2024-06-14T18:02:32-0400 ERROR [add_pioreactor]
+ set -e
+ export LC_ALL=C
+ LC_ALL=C
+ SSHPASS=raspberry
+ PIO_VERSION=1.1
+ PIO_MODEL=pioreactor_20ml
+ HOSTNAME=pr3
+ HOSTNAME_local=pr3.local
+ USERNAME=pioreactor
++ hostname
+ LEADER_HOSTNAME=pr1
+ ssh-keygen -R pr3.local
+ ssh-keygen -R pr3
++ getent hosts pr3.local
++ cut '-d ' -f1
+ ssh-keygen -R fe80::3145:ddda:b3ba:926d
+ N=120
+ counter=0
+ sshpass -p raspberry ssh pioreactor@pr3.local 'test -d /home/pioreactor/.pioreactor && echo '\''exists'\'''
Warning: Permanently added 'pr3.local' (ED25519) to the list of known hosts.
+ pio workers discover -t
+ grep -q pr3
+ sshpass -p raspberry ssh-copy-id pioreactor@pr3.local
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/pioreactor/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: WARNING: All keys were skipped because they already exist on the remote system.
(if you think this is a mistake, you may want to use -f option)
+ UNIT_CONFIG=/home/pioreactor/.pioreactor/config_pr3.ini
+ rm -f /home/pioreactor/.pioreactor/config_pr3.ini
+ touch /home/pioreactor/.pioreactor/config_pr3.ini
+ echo -e '# Any settings here are specific to pr3, and override the settings in shared config.ini'
+ crudini --set /home/pioreactor/.pioreactor/config_pr3.ini pioreactor version 1.1 --set /home/pioreactor/.pioreactor/config_pr3.ini pioreactor model pioreactor_20ml
+ ssh-keyscan pr3.local
# pr3.local:22 SSH-2.0-OpenSSH_9.2p1 -2+deb12u1
# pr3.local:22 SSH-2.0-OpenSSH_9.2p1 -2+deb12u1
# pr3.local:22 SSH-2.0-OpenSSH_9.2p1 -2+deb12u1
# pr3.local:22 SSH-2.0-OpenSSH_9.2p1 -2+deb12u1
# pr3.local:22 SSH-2.0-OpenSSH_9.2p1 -2+deb12u1
+ pios sync-configs --units pr3 --skip-save
2024-06-14T18:02:30-0400 DEBUG [sync_configs] Syncing configs on pr3…
+ sleep 1
+ N=120
+ counter=0
+ sshpass -p raspberry ssh pioreactor@pr3.local 'test -f /home/pioreactor/.pioreactor/config.ini && echo '\''exists'\'''
+ ssh pioreactor@pr3.local 'echo "server pr1.local iburst prefer" | sudo tee -a /etc/chrony/chrony.conf'
tee: /etc/chrony/chrony.conf: No such file or directory

Traceback (most recent call last):
  File "/usr/local/bin/pio", line 8, in <module>
    sys.exit(pio())
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/pioreactor/cluster_management/__init__.py", line 103, in add_worker
    raise BashScriptError(res.stderr)
pioreactor.exc.BashScriptError: [the same shell trace as above, repeated, ending with]
tee: /etc/chrony/chrony.conf: No such file or directory

hmhmhm the file chrony.conf should be present in the latest worker image …
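
If you’re curious, you can check on the worker whether the chrony package is installed at all with, for example:

dpkg -s chrony
ls /etc/chrony/

If the package is missing, that directory won’t exist, which would explain the tee error in your log.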

It’s not that important. To unblock you, you can comment out that chrony line in the add-worker script on the leader. Open the script:

sudo nano /usr/local/bin/add_new_pioreactor_worker_from_leader.sh

And find the line near the bottom that looks like

ssh "$USERNAME"@"$HOSTNAME_local" "echo \"server $LEADER_HOSTNAME.local ibur

Just add a # in front:

# ssh "$USERNAME"@"$HOSTNAME_local" "echo \"server $LEADER_HOSTNAME.local ibur

Save and exit and try again.
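
If scrolling through the script in nano is a pain, a rough alternative is to comment out every line in that script that mentions chrony in one go (this edits the file in place, so double-check it afterwards):

sudo sed -i '/chrony/ s/^/# /' /usr/local/bin/add_new_pioreactor_worker_from_leader.sh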

I’ll investigate what’s up with this file not being present.

I found this near the bottom. Is this what you are talking about?
[screenshot of part of the script in nano]

No no, scroll way past that. Use your keyboard arrow keys to navigate.

Okay, I found it and put the # in front, but it gave me this error:

hm, anything below that message, or is that all?

Sorry about this @kradical - this should be easier. We’ll reevaluate this workflow and look for improvements next week.

It has this below it:

[screenshot showing the same chrony tee error again]
It looks like the suggestion above didn’t take. It’s still trying to execute the chrony line. Can you try the steps from my earlier post (Issues connecting to worker after trying to switch wifi - #4 by CamDavidsonPilon) again and confirm the file was saved? (ctrl-x to quit, and hit y when prompted to save.)
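
A quick way to confirm the edit took is to print the relevant line and check that it now starts with a #:

grep -n chrony /usr/local/bin/add_new_pioreactor_worker_from_leader.sh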

I just realized I commented out the wrong part, sorry for the confusion. It successfully added the Pioreactor to the cluster.


So I was able to get the worker back into the leader’s inventory, but looking at it today its status remains “waiting for information” and I can’t get it to run commands via the UI.

Hm, I’m guessing you’ve tried rebooting it? It sounds like an MQTT issue. Can you try SSHing into the worker and seeing if there are any error logs in pio logs?

(This is unique to some clusters who use VPNs or multiple networks, maybe not yours: do you need to update the [mqtt] broker_address line in the worker’s config.ini?)
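
One extra sanity check, assuming the mosquitto command-line clients happen to be installed on the worker (they may not be on a worker image): try subscribing to the leader’s broker from the worker, using your leader’s hostname from the log, e.g.:

mosquitto_sub -h pr1.local -t 'pioreactor/#' -C 1 -W 10

A “connection refused” or unknown-host error there would point at a network/broker reachability problem rather than something on the worker itself.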

I am able to SSH into it, but it is either taking a really long time to load or it isn’t working. How do I update the [mqtt] broker_address?

  1. Do commands / typing take a long time to execute, like pio logs?
  2. Is this a Raspberry Pi Zero (not Zero 2)?
  3. Try the following on that worker:
    pio kill --name monitor && pio run monitor
    
    Any errors?

Updating the mqtt broker_address is done in the unit-specific config.ini, but normally you don’t need to do this unless you’ve had to do it before for other workers. Below is an example in my cluster (which uses a VPN).
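
It’s just an [mqtt] section in the unit-specific config, something along these lines (the address below is only a placeholder, not a real value from my cluster):

[mqtt]
broker_address=100.64.0.5

where broker_address is whatever IP or hostname the worker can actually reach the leader’s broker at.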

  1. I am able to type the command normally, but the output isn’t appearing for the pio logs command.
  2. This unit is using a Raspberry Pi 3 Model A.
  3. This is what I got back from that command:
    [screenshot of the command output]

I haven’t updated the mqtt broker in any of these before.

Are you using the latest worker image? Like, did you grab the latest from here?

A quick check: try pio version on that worker, and you should see a version like 24.5.x or 24.6.x.

The chrony issue, the mqtt issue, and the pio kill --name issue all suggest it’s an older worker image.

I have an older image from a couple months ago. The version is 24.3.4.

I don’t think we can promise any kind of leader compatibility with older worker images, unfortunately. Our general model is that users keep all Pioreactors in their cluster on nearly the same software version.

Try the latest worker image (downloadable from the link above) on the worker - I think you’ll get the expected results from there.
