Starting experiments through Terminal

I tried starting an experiment last night, but when I checked on it several hours later, I noticed it had stopped. I think I ran it with nohup like this:
$ nohup python /home/pioreactor/salt_temp_cycle_experiment.py > stc_log.out %

I was pretty tired, so I just restarted the experiment from terminal on a different computer and went to bed, and I’m trying to piece together what happened just now.

At about 8:04 PM last night, I started the experiment via nohup python /home/pioreactor/salt_temp_cycle_experiment.py > stc_log.out % through a terminal on my laptop. I verified that this was the command by viewing the timestamped command history in the laptop terminal, and confirmed the time by looking at the stc_log.out log file. [Note: all timestamps in the log files are 8 hours ahead of my timezone, and I have converted the times.] Everything seemed to be working properly.
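The time conversion described in the note can be reproduced on the command line. This is a small sketch that assumes GNU date (the version that ships with Raspberry Pi OS) and takes the 8-hour offset from the note above; the zone string PST8 is only an illustrative way of writing "8 hours behind UTC" and is not taken from the post.

```shell
# Timestamp copied from the hangup line in stc_log.out (logged in UTC).
log_time='2023-01-11T06:21:00+0000'

# Assumption: local time is 8 hours behind UTC, written as the POSIX zone string PST8.
local_time=$(TZ='PST8' date -d "$log_time" '+%Y-%m-%d %H:%M')

echo "$local_time"   # prints 2023-01-10 22:21
```

The result matches the 10:21 PM hangup time quoted in the next paragraph.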

At 10:21 PM last night, the experiment exited due to a signal Hangup. Here is the relevant portion from the log file. I cleaned up some of the excerpt, but there are also some weird artifacts in the log file. I am not sure what caused them and I left them in.

Disconnect caused by: 2023-01-11T06:21:00+0000 DEBUG [app] [temperature_control] Exiting caused by signal Hangup.

stc_log.out excerpt

e[32m2023-01-11T06:04:19+0000 INFO [app] [add_media] 0.5mLe[0m
e[36m2023-01-11T06:04:20+0000 DEBUG [app] [add_media] Initialized GPIO-13 using hardware-timing, initial frequency = 200.0 hz.e[0m
e[36m2023-01-11T06:04:25+0000 DEBUG [app] [add_media] Cleaned up GPIO-13.e[0m
e[36m2023-01-11T06:04:25+0000 DEBUG [app] [add_media] Cleaned up GPIO-13.e[0m
e[32m2023-01-11T06:04:31+0000 INFO [app] [remove_waste] 0.5mLe[0m

e[36m2023-01-11T06:18:36+0000 DEBUG [app] [temperature_control] features={‘previous_heater_dc’: 0, ‘room_temp’: 22.0, ‘time_series_of_temp’: […]}e[0m
e[36m2023-01-11T06:21:00+0000 DEBUG [app] [temperature_control] Exiting caused by signal Hangup.e[0m
e[36m2023-01-11T06:21:02+0000 DEBUG [app] [PWM-18] Cleaned up GPIO-18.e[0m
e[36m2023-01-11T06:21:03+0000 DEBUG [app] [temperature_automation] Disconnected.e[0m
e[36m2023-01-11T06:21:04+0000 DEBUG [app] [temperature_automation] Disconnected successfully from MQTT.e[0m
e[32m2023-01-11T06:21:05+0000 INFO [app] [temperature_control] Disconnected.e[0m
e[36m2023-01-11T06:21:06+0000 DEBUG [app] [temperature_control] Disconnected successfully from MQTT.e[0m
e[36m2023-01-11T06:21:07+0000 DEBUG [app] [dosing_control] Exiting caused by Python atexit.e[0m
e[36m2023-01-11T06:21:07+0000 DEBUG [app] [dosing_automation] Disconnected.e[0m
e[36m2023-01-11T06:21:07+0000 DEBUG [app] [dosing_automation] Disconnected successfully from MQTT.e[0m
e[32m2023-01-11T06:21:08+0000 INFO [app] [dosing_control] Disconnected.e[0m
e[36m2023-01-11T06:21:09+0000 DEBUG [app] [dosing_control] Disconnected successfully from MQTT.e[0m

Right at midnight, as I was about to go to bed, I used my phone to check the experiment via the web GUI and saw it had stopped. I went to my desktop, opened a terminal via SSH, and started the experiment again via nohup python /home/pioreactor/salt_temp_cycle_experiment.py > stc_2_logs.out &, this time outputting to a new log file. The experiment has been running fine since then, but I also haven’t shut off this computer and I have kept the terminal window open.

Since I started the experiment in my laptop terminal window at 8:04 PM, and it disconnected at 10:21 PM, there are three possible causes that come to mind. The laptop enters sleep mode after an hour of disuse, so maybe I last checked my laptop around 9:20 PM last night and it entered sleep mode at 10:21 PM. Alternatively, I could have closed the terminal window sometime last night after starting the experiment (maybe at 10:21?). I don’t remember doing this, but it is one possible cause I could think of. Lastly, my house sometimes has issues with the internet connection. The WiFi network will still be up and running, but devices connected to it won’t have access to the internet, and sometimes my roommates will restart it. I don’t think this happened last night, but I just realized that this could be an issue at some point in the future (and possibly cause a signal hangup?).

How can I prevent this issue from occurring again? I had wanted my program to run in the background, and continue running even after exiting the terminal or shutting off my computer.

On a related topic, is it possible to attach a monitor and keyboard directly to the Pioreactor and use a terminal without having to resort to SSH from another computer? I know it is possible with certain Raspberry Pi operating systems, but I am unsure how the Pioreactor software might affect that capability. When I installed the Pioreactor software by flashing the software image onto the Raspberry Pi SD card, I had assumed I was installing a custom operating system onto the Raspberry Pi and that it wouldn’t have the functionality of a full Raspberry Pi operating system. Now that I have a bit more familiarity with using a Raspberry Pi, I am starting to suspect that isn’t the case. Regardless, one time I did try plugging an old monitor into my Pi but couldn’t get it to work (it would only display a black and white screen with a bunch of gibberish/random characters and blank spaces). I think it may have also given me some sort of error message.

Is there any special way to connect my pi to an external monitor and keyboard (can I just plug them in and have it run)? Given my last experience with plugging it in, I don’t want to accidentally mess something up while I currently have an experiment running.

  • I think those artifacts in your log file are the “colors” from the normally logged stdout. After some quick googling, I don’t know how to remove them yet. We do capture all logs as plain text in the background too: you can do pio logs -n 100 on the command line to see the most recent 100 log lines, for example.
  • Literally nohup means “no hangup” - so how weird is it that you get exactly that! I’d like you to try the following:
    nohup python your_script.py >/dev/null 2>&1 &
    
    Since we capture logs already, you won’t lose any information. My theory is that it’s the redirect that is causing problems ¯\_(ツ)_/¯
    Just so you feel confident, we’ve run jobs for weeks without interruption, so it’s possible, just sometimes takes the right magical invocation.
  • Wifi going down should be okay, mostly since everything is local on one Raspberry Pi.
  • Attaching a monitor is not possible yet. You’re correct that you installed a modified Raspberry Pi OS - we’ve taken the OS Lite image and installed our custom software on this. But the OS Lite image doesn’t have the support for desktop use. It’s possible to install that support on the OS now, but I don’t know how, and it would be high risk to do so. In the future though, we can start with the Raspberry Pi OS with desktop image so plugging in a monitor and keyboard works.
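
For the color artifacts mentioned in the first bullet, one possible way to get a plain-text copy of a log file is to strip the ANSI color sequences with sed. This is a sketch and not something taken from the Pioreactor docs. It assumes the artifacts are standard ESC[...m sequences (the excerpt shows the escape character rendered as "e"), and the demo file names and temporary directory are made up for the demonstration.

```shell
# Build the literal ESC character in a portable way.
esc=$(printf '\033')

# Hypothetical scratch directory and sample file with one colored line shaped like the excerpt.
tmp=$(mktemp -d)
printf '%s[32m2023-01-11T06:04:19+0000 INFO [app] [add_media] 0.5mL%s[0m\n' "$esc" "$esc" > "$tmp/demo_log.out"

# Remove every ESC[<digits and semicolons>m sequence and write the result to a new file.
sed "s/${esc}\[[0-9;]*m//g" "$tmp/demo_log.out" > "$tmp/demo_log_plain.out"

cat "$tmp/demo_log_plain.out"
```

Pointing the same sed command at stc_log.out instead of the demo file should give a copy of the real log without the color codes.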

That’s good to know.

Update regarding the signal hangup issue.

I had been running my experiment for ~1-2 days through $ nohup python /home/pioreactor/my_script.py > stc_2_logs.out & via a terminal on my desktop and left it running overnight. I tried closing the terminal yesterday morning and the experiment shut off again due to a signal hangup. I was able to restart my experiment from approximately where I left off with a couple of quick edits to the script, and this time I started it with nohup python /home/pioreactor/my_script.py >/dev/null 2>&1 &. I will let you know what happens the next time I close the terminal or turn off the PC.

I won’t have access to my pioreactor for the next few days, but I will be sure to update you when I know more.

Also, two questions. First, why did you have me write the output log to /dev/null? Is it a typical/standard-convention log output directory, or is there some sort of troubleshooting benefit to this directory? Second, what does the ‘2>&1’ do?

>/dev/null 2>&1

This is just a way to redirect all output (for example, logs and errors at the Python and Linux level) to nothing.

I suggest trying this because internally we do this for our (stable) long-running processes, so I want to eliminate possibilities for why you are seeing early exits.
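
To make the two questions above concrete: /dev/null is a special device file that discards whatever is written to it, and 2>&1 points file descriptor 2 (stderr) at wherever file descriptor 1 (stdout) currently points. The sketch below uses a made-up function, emit, as a stand-in for a script and shows that the order of the two redirections matters.

```shell
# Hypothetical stand-in for a script: one line to stdout, one line to stderr.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

# stdout is discarded first, then stderr is pointed at the same place: nothing is left to capture.
both_discarded=$(emit >/dev/null 2>&1)

# Reversed order: stderr is pointed at the original stdout (the capture) before stdout is discarded.
only_stderr=$(emit 2>&1 >/dev/null)

echo "captured with '>/dev/null 2>&1': [$both_discarded]"
echo "captured with '2>&1 >/dev/null': [$only_stderr]"
```

With the order used in this thread (>/dev/null 2>&1), both streams end up discarded.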

It took me a bit of time to get around to trying this out, but it is still not working for me. Let me know what logs you would like to see.

This morning, I tried running nohup python my_script.py >/dev/null 2>&1 & a couple times and then closing the command prompt afterwards. The pioreactor automations would disconnect immediately afterwards.

08:24:50 pioreactor1 dosing control Disconnected.
08:24:46 pioreactor1 temperature control Disconnected.
08:24:39 pioreactor1 dosing automation NoEvent
08:24:35 pioreactor1 temperature control Ready.
08:24:35 pioreactor1 temperature control Starting thermostat(target_temperature=30).
08:24:34 pioreactor1 dosing control Ready.
08:24:34 pioreactor1 dosing control Starting silent(duration=None).
08:23:38 pioreactor1 dosing control Disconnected.
08:23:35 pioreactor1 temperature control Disconnected.
08:23:20 pioreactor1 dosing automation NoEvent
08:23:16 pioreactor1 temperature control Ready.
08:23:16 pioreactor1 temperature control Starting thermostat(target_temperature=30).
08:23:15 pioreactor1 dosing control Ready.
08:23:15 pioreactor1 dosing control Starting silent(duration=None).
08:23:10 pioreactor1 dosing control Disconnected.
08:23:08 pioreactor1 od reading Disconnected.
08:23:08 pioreactor1 growth rate calculating Disconnected.
08:21:35 pioreactor1 dosing automation DilutionEvent: exchanged 0.3mL
08:21:32 pioreactor1 remove waste 0.6mL

Hm, here are a few more things to try (based on some readings here):

  • Try closing the terminal with exit on the command line - my thoughts are that the SSH connection is not being cleaned up correctly, and this is causing problems.
  • Try python my_script.py >/dev/null 2>&1 & disown (note the disown, and no nohup)

Are you connecting using PuTTY?

I’m not sure what PuTTY is.

I’m just using the default Windows Command Prompt to connect (via ssh pioreactor@pioreactor.local). I have also used Windows PowerShell to transfer files to/from the Raspberry Pi via the scp command. Is PuTTY something I should consider trying?

I just ran the python my_script.py >/dev/null 2>&1 & disown and it appears to be working. Here’s what I did.

  1. Open up Command Prompt on Windows 10. SSH into the Raspberry Pi.
  2. Enter python my_script.py >/dev/null 2>&1 & disown. The script started normally. I checked it was running with ps aux | grep python and saw my script name in the process list.
  3. I quit out of Command Prompt using the ‘X’ button in the top right.
  4. I waited several minutes and then logged back in via step 1.
  5. I confirmed that the script was still running using ps aux | grep python and then also looked in the pioreactor web-GUI recent log and confirmed no disconnect messages.

Then, I killed the process (kill [PID]) to try out your first suggestion.

  1. Using my already open command prompt/SSH Terminal, I entered nohup python my_script.py >/dev/null 2>&1 &
  2. I confirmed the process was running via ps aux | grep python
  3. I quit using exit
  4. I got the message “logout\n Connection to pioreactor1.local closed.”
  5. No disconnect message in the web GUI. Logging back in via SSH, ps aux | grep python shows the script is still running.

It looks like the disconnect message is caused by exiting Terminal via the “X” button in the top right.
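
The role of the hangup signal can be illustrated in isolation. The sketch below uses sleep as a stand-in for the experiment script and sends SIGHUP by hand with kill. That is an assumption about what a closing terminal ends up delivering to the job, and it does not reproduce the Windows Command Prompt behavior itself.

```shell
# Stand-in for a long-running script, started as a plain background job.
sleep 30 &
plain_pid=$!

# The same stand-in, started under nohup so that SIGHUP is ignored.
nohup sleep 30 >/dev/null 2>&1 &
guarded_pid=$!

# Give nohup a moment to start before signalling.
sleep 1

# Simulate a hangup by sending SIGHUP to both jobs.
kill -HUP "$plain_pid" "$guarded_pid"

# The plain job dies from the signal; its exit status is 128 + 1 (SIGHUP is signal 1).
plain_status=0
wait "$plain_pid" || plain_status=$?

# kill -0 only checks that the process exists; it sends no signal.
if kill -0 "$guarded_pid" 2>/dev/null; then guarded_alive=yes; else guarded_alive=no; fi

echo "plain job exit status: $plain_status"
echo "nohup job still running: $guarded_alive"

# Clean up the surviving stand-in.
kill "$guarded_pid"
```

Note that nohup only sets the job to ignore SIGHUP at launch. A program that installs its own SIGHUP handler can still react to the signal, which might be why the log shows "Exiting caused by signal Hangup" even under nohup; that is a guess and not something confirmed in this thread.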

Okay, good to know! This may be a Windows Command Prompt-specific issue. I’ll add some documentation to our Docs pages on how best to run a long-running script. Thanks for your testing!