Raspi response during deletion of large timelapse file


#1

I just now visited the Timelapse tab and deleted a large, older file that was there. I recall it being 20MB in size.

I note that having pressed the Trashcan button and confirming, the Raspberry became completely unresponsive while this was occurring, perhaps three to five full minutes it would seem.

The Raspberry Pi 3B was doing a little Conky interface reporting but otherwise, that's it. During this period, I couldn't:

  • resolve "octopi.local"
  • remote into it
  • VNC into it

I folded my laundry, came back and of course the OctoPrint screen then removed the former line entry for the timelapse file and then everything else worked.

I was surprised at his given that this Raspi has four cores. I didn't think anything could block so badly as this on Raspbian.

I worry that deleting a large file during a print job could cause print-related problems, I'd guess.

1.3.10rc2, 0.15.0, macOS 10.14.1

I'm thinking that we're seeing something like this topic under discussion. Perhaps behind the scenes, an unlink is GIL-related...? Honestly, Python isn't my forte.

The OS-level file removal primitives are synchronous on both Unix and Windows, so I think you pretty much have to use a worker thread.


#2

I had just a 7.5 MB timelapse file, but it was deleted in a blink of an eye...
Strange... :thinking:


#3

Could it have been a network/wifi connection issue between your browser and your Pi? I pretty routinely delete lots of large timelapse files (sometimes dozens at a time) and it's always been pretty fast. Deleting files usually takes hardly any cpu time and relatively little io bandwidth since deleting is a simple matter of changing a few thing around in the file system (not actually removing the file). Unless you have some kind of data shredding plugin installed (which would be really cool).

It could always be a python quirk, as you mentioned (I'm not a fan of python either really), but several minutes seems.. suspicious. It certainly could be some odd GIL behavior. Is there anything in the logs?


#4

There's absolutely nothing regarding the two deleted timelapses in the logs. I don't think it had anything to do with wi-fi errors since I've got a pretty good setup here (Orbi room routers pushing about 32Gbps).


#5

@foosel, @Ewald_Ikemann, @FormerLurker

Interestingly, this has now reared its ugly head again but I was smart enough to take some photos.

I was attempting to test the new 1.3.9rc3 release and decided that I might as well do a little testing of the drag-and-drop feature.

Browser: Safari 12.0.1
Finder: Mojave 10.14.1
File type: gcode
File size: 2.5MB

It's been trying to upload for about ten minutes now and is stuck in this mode:

25%20AM

Looks like three of the four Raspberry Pi 3B cores are slammed.

No response from ssh, vnc attempts. Either the serial line is blocked or the browser isn't getting any updates but I'm not seeing the usual temperatures blips on the Terminal tab.

websocket is about the only action I see in the browser's Network tab under Developer (but maybe that's ancient history from ten minutes ago):

Continuing to watch (15 minutes now)...

And now this:

I'm getting slow updates on the Conky desktop. I see now that the spare core is at 39% instead of the earlier 8%/9% I've seen.

"Server is offline" dialog notification now in the browser.

Just trying to wait it out, like before with the "timelapse smash".


#6

Looks like the API isn't responding, the Conky desktop hasn't updated, no action from anything remote. The local keyboard/mouse combination will move the cursor around the screen but it doesn't seem to want to interact at all with the Desktop.

I've finally resorted to power-cycling the printer although I really didn't want to do that.

Checking the octoprint.log from that earlier session:

2018-11-28 10:26:33,778 - octoprint.startup - INFO - ******************************************************************************
2018-11-28 10:26:33,782 - octoprint.startup - INFO - Starting OctoPrint 1.3.10rc3
2018-11-28 10:26:33,785 - octoprint.startup - INFO - ******************************************************************************
2018-11-28 10:26:33,877 - octoprint.startup - INFO - Blacklist processing done, adding 1 blacklisted plugin versions: roomtemp (any)
2018-11-28 10:26:33,969 - octoprint.plugin.core - INFO - Loading plugins from /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins, /home/pi/.octoprint/plugins and installed plugin packages...
2018-11-28 10:26:35,968 - octoprint.plugin.core - INFO - Plugin CuraEngine (<= 15.04) is disabled.
2018-11-28 10:26:36,048 - octoprint.plugin.core - INFO - Plugin Force Login is disabled.
2018-11-28 10:26:38,438 - octoprint.plugin.core - INFO - Plugin Bed Visualizer (0.1.6) is disabled.
2018-11-28 10:26:38,493 - octoprint.plugin.core - INFO - Plugin Bed Leveling Wizard (0.2.0) is disabled.
2018-11-28 10:26:38,584 - octoprint.plugin.core - INFO - Found 27 plugin(s) providing 23 mixin implementations, 28 hook handlers
2018-11-28 10:26:38,599 - octoprint.server.heartbeat - INFO - Starting server heartbeat, 900.0s interval
2018-11-28 10:26:38,741 - octoprint.server - INFO - Intermediary server started
2018-11-28 10:26:38,741 - octoprint.plugin.core - INFO - Loading plugins from /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins, /home/pi/.octoprint/plugins and installed plugin packages...
2018-11-28 10:26:39,761 - octoprint.plugin.core - INFO - Found 27 plugin(s) providing 23 mixin implementations, 28 hook handlers
2018-11-28 10:26:39,823 - octoprint.filemanager.storage - INFO - Initializing the file metadata for /home/pi/.octoprint/uploads...
2018-11-28 10:26:40,605 - octoprint.filemanager.storage - INFO - ... file metadata for /home/pi/.octoprint/uploads initialized successfully.
2018-11-28 10:26:40,612 - octoprint.printer.standard - INFO - Using print time estimator provided by PrintTimeGenius
2018-11-28 10:26:40,869 - octoprint.plugins.softwareupdate - INFO - Loaded version cache from disk
2018-11-28 10:26:40,878 - octoprint.plugins.tracking - INFO - Initialized anonymous tracking
2018-11-28 10:26:44,575 - octoprint.util.pip - INFO - Using "/home/pi/oprint/bin/python2 -m pip" as command to invoke pip
2018-11-28 10:26:48,459 - octoprint.util.pip - INFO - Version of pip is 10.0.0
2018-11-28 10:26:48,467 - octoprint.util.pip - INFO - pip installs to /home/pi/oprint/lib/python2.7/site-packages (writable -> yes), --user flag needed -> no, virtual env -> yes
2018-11-28 10:26:48,470 - octoprint.util.pip - INFO - ==> pip ok -> yes
2018-11-28 10:26:48,481 - octoprint.plugin.core - INFO - Initialized 23 plugin implementation(s)
2018-11-28 10:26:48,520 - octoprint.plugins.gcodesystemcommands - INFO - Add command definition OCTO801 = /usr/bin/aplay /home/pi/sounds/r2d2/1.wav &
2018-11-28 10:26:48,537 - octoprint.plugins.gcodesystemcommands - INFO - Add command definition OCTO802 = /usr/bin/aplay /home/pi/sounds/r2d2/2.wav &
2018-11-28 10:26:48,539 - octoprint.plugins.gcodesystemcommands - INFO - Add command definition OCTO803 = /usr/bin/aplay /home/pi/sounds/r2d2/3.wav &
2018-11-28 10:26:48,541 - octoprint.plugins.gcodesystemcommands - INFO - Add command definition OCTO804 = /usr/bin/aplay /home/pi/sounds/r2d2/4.wav &
2018-11-28 10:26:48,545 - octoprint.plugins.gcodesystemcommands - INFO - Add command definition OCTO805 = /usr/bin/aplay /home/pi/sounds/r2d2/5.wav &
2018-11-28 10:26:48,551 - octoprint.plugins.gcodesystemcommands - INFO - Add command definition OCTO806 = /usr/bin/aplay /home/pi/sounds/r2d2/6.wav &
2018-11-28 10:26:48,555 - octoprint.plugins.gcodesystemcommands - INFO - Add command definition OCTO900 = /usr/bin/ssh -t pi@pisense.local "sudo pkill temp-graph.py && sudo /home/pi/scripts/cls.py && sleep 3 && sudo poweroff"
2018-11-28 10:26:48,560 - octoprint.plugins.gcodesystemcommands - INFO - Add command definition OCTO902 = /usr/bin/ssh -t pi@picam1.local "sudo pkill mjpg_streamer && sleep 3 && sudo poweroff"
2018-11-28 10:26:48,593 - octoprint.plugin.core - INFO - 27 plugin(s) registered with the system:
|  Action Command Prompt Support (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/action_command_prompt
|  Active Filters Extended (0.0.2) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_active_filters_extended
|  Announcement Plugin (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/announcements
|  Anonymous Usage Tracking (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/tracking
|  Application Keys Plugin (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/appkeys
|  Backup & Restore (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/backup
| !Bed Leveling Wizard (0.2.0) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_bedlevelingwizard
| !Bed Visualizer (0.1.6) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_bedlevelvisualizer
|  Core Wizard (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/corewizard
| !CuraEngine (<= 15.04) (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/cura
|  Discovery (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/discovery
|  Dragon Order (0.1.2) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_dragon_order
|  EEPROM Marlin Editor Plugin (1.2.1) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_eeprom_marlin
|  Firmware Updater (1.0.0) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_firmwareupdater
| !Force Login (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/forcelogin
|  GCODE System Commands (0.1.1) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_gcodesystemcommands
|  GitFiles (1.1.4) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_gitfiles
|  Logging (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/logging
|  Navbar Temperature Plugin (0.11) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_navbartemp
|  Pi Support Plugin (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/pi_support
|  Plugin Manager (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/pluginmanager
|  Printer Safety Check (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/printer_safety_check
|  PrintTimeGenius Plugin (1.3) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_PrintTimeGenius
|  Software Update (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/softwareupdate
|  System Command Editor (0.3.2) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_systemcommandeditor
|  Themeify (1.2.2) = /home/pi/oprint/local/lib/python2.7/site-packages/octoprint_themeify
|  Virtual Printer (bundled) = /home/pi/oprint/lib/python2.7/site-packages/octoprint/plugins/virtual_printer
2018-11-28 10:26:48,831 - octoprint.environment - INFO - Detected environment is Python 2.7.13 under Linux (linux2). Details:
|  hardware:
|    cores: 4
|    freq: 1200.0
|    ram: 918179840
|  os:
|    id: linux
|    platform: linux2
|  plugins:
|    pi_support:
|      model: Raspberry Pi 3 Model B Rev 1.2
|      octopi_version: 0.15.0
|  python:
|    pip: 10.0.0
|    version: 2.7.13
|    virtualenv: /home/pi/oprint
2018-11-28 10:26:48,891 - octoprint.server - INFO - Reset webasset folder /home/pi/.octoprint/generated/webassets...
2018-11-28 10:26:48,923 - octoprint.server - INFO - Reset webasset folder /home/pi/.octoprint/generated/.webassets-cache...
2018-11-28 10:26:49,902 - octoprint.server - INFO - Shutting down intermediary server...
2018-11-28 10:26:50,375 - octoprint.server - INFO - Intermediary server shut down
2018-11-28 10:26:50,383 - octoprint.events - INFO - Processing startup event, this is our first event
2018-11-28 10:26:50,385 - octoprint.events - INFO - Adding 0 events to queue that were held back before startup event
2018-11-28 10:26:50,387 - octoprint.filemanager - INFO - Adding backlog items from all storage types to analysis queue...
2018-11-28 10:26:50,487 - octoprint.filemanager - INFO - Added 0 items from storage type "local" to analysis queue
2018-11-28 10:26:50,608 - octoprint.util.comm - INFO - Changing monitoring state from "Offline" to "Opening serial port"
2018-11-28 10:26:50,617 - octoprint.util.comm - INFO - Changing monitoring state from "Opening serial port" to "Connecting"
2018-11-28 10:26:50,623 - octoprint.util.comm - INFO - M110 detected, setting current line number to 0
2018-11-28 10:26:50,653 - octoprint.plugins.discovery - INFO - Registered 'OctoPrint instance "charming-pascal"' for _http._tcp
2018-11-28 10:26:50,669 - octoprint.plugins.discovery - INFO - Registered 'OctoPrint instance "charming-pascal"' for _octoprint._tcp
2018-11-28 10:26:50,678 - octoprint.plugins.discovery - INFO - Registered OctoPrint instance "charming-pascal" for SSDP
2018-11-28 10:26:50,768 - octoprint.server - INFO - Listening on http://127.0.0.1:5000
2018-11-28 10:26:50,782 - octoprint.plugins.navbartemp - INFO - Broadcom detected
2018-11-28 10:26:50,785 - octoprint.plugins.navbartemp - INFO - Checking SoC internal temperature
2018-11-28 10:26:50,952 - octoprint.plugins.pluginmanager - INFO - Loaded plugin repository data from disk, was still valid
2018-11-28 10:26:51,616 - octoprint.util.comm - INFO - M110 detected, setting current line number to 0
2018-11-28 10:26:51,698 - octoprint.plugins.pluginmanager - INFO - Loaded notice data from disk, was still valid
2018-11-28 10:26:53,242 - octoprint.util.pip - INFO - Using "/home/pi/oprint/bin/python2 -m pip" as command to invoke pip
2018-11-28 10:26:53,245 - octoprint.util.pip - INFO - pip installs to /home/pi/oprint/lib/python2.7/site-packages (writable -> yes), --user flag needed -> no, virtual env -> yes
2018-11-28 10:26:53,246 - octoprint.util.pip - INFO - ==> pip ok -> yes
2018-11-28 10:26:53,260 - octoprint.plugins.softwareupdate - WARNING - Software update hook systemcommandeditor provides check for itself but that was already registered by eeprom_marlin - overwriting that third party registration now!

There appears to be nothing amiss. Reviewing the logs with their timestamps:

-rw-r--r-- 1 pi pi 269435 Nov 28 11:22 octoprint.log
-rw-r--r-- 1 pi pi 110615 Nov 28 10:30 plugin_firmwareupdater_console.log
-rw-r--r-- 1 pi pi  53589 Nov 28 10:24 plugin_PrintTimeGenius_engine.log
-rw-r--r-- 1 pi pi 314694 Nov 28 10:30 plugin_softwareupdate_console.log

The 1.3.9rc3 install seems happy. From the PrintTimeGenius log, it looks like it completed its re-analysis of the existing three files at 10:24am but it has no knowledge of the attempted new upload. The firmwareupdater log appears to be a dupe of the softwareupdate log, to be honest.

There is no new file in uploads. df shows plenty of room.

Reloading the page:

04%20AM

Conky is as happy as a clam now:

35%20AM


#7

From the /var/log/syslog at around the time-of-death I would have expected to have seen something regarding the power cycle but all I'm seeing is a time change notification: Nov 28 11:19:46 charming-pascal systemd[689]: Time has been changed.

Nov 28 10:26:47 charming-pascal systemd[689]: Starting Virtual filesystem service - Media Transfer Protocol monitor...
Nov 28 10:26:48 charming-pascal dbus-daemon[708]: Successfully activated service 'org.gtk.vfs.MTPVolumeMonitor'
Nov 28 10:26:48 charming-pascal systemd[689]: Started Virtual filesystem service - Media Transfer Protocol monitor.
Nov 28 11:19:46 charming-pascal systemd[689]: Time has been changed
Nov 28 11:19:46 charming-pascal systemd-timesyncd[275]: Synchronized to time server 45.33.106.180:123 (2.debian.pool.ntp.org).
Nov 28 11:19:46 charming-pascal systemd[1]: Time has been changed
Nov 28 11:19:46 charming-pascal systemd[1]: apt-daily.timer: Adding 5h 23min 53.226410s random time.
Nov 28 11:19:46 charming-pascal systemd[1]: apt-daily-upgrade.timer: Adding 10min 43.674249s random time.
Nov 28 11:19:53 charming-pascal systemd[1]: Started Session c2 of user pi.
Nov 28 11:20:08 charming-pascal kernel: [   59.623409] w1_master_driver w1_bus_master1: Attaching one wire slave 00.800000000000 crc 8c
Nov 28 11:20:08 charming-pascal kernel: [   59.626928] w1_master_driver w1_bus_master1: Family 0 for 00.800000000000.8c is not registered.

Grasping at straws, I've just now edited the /etc/haproxy/haproxy.cfg file's Global section, adding:

tune.ssl.default-dh-param 2048

...per these earlier warnings in its log:

Nov 28 10:26:37 charming-pascal haproxy-systemd-wrapper[596]: [WARNING] 331/102637 (597) : Setting tune.ssl.default-dh-param to 1024 by default, if your workload permits it you should set it to at least 2048. Please set a value >= 1024 to make this warning disappear.

#8

To be dutiful, I'm including a screencap of htop in action, no connected browser.

I kill'd conky, my own octo-pause service, did a sudo service webcamd stop and repeated the experiment and it immediately finished as you'd expect. It feels like I'm overloading the Raspi, to be honest but I would need to do more troubleshooting. The earlier htop memory snapshot doesn't look too bad, though. I'm pretty confident about my own code but it's possible that the following might happen:

  • OctoPrint gets busy doing something like the big timelapse file download or the drag-and-drop upload
  • the conky refreshes are split up into individual queries to the OctoPrint API and they're nicely staggered with prime-number-delays so that they don't normally coincide
  • but if OctoPrint can't respond, maybe it stacks those up and gets into a fatal job queue that will never be fulfilled (guessing)

#9

How does your conky frontend work exactly?


#10

It began with a heavily-modified version of this, as modified for a 2.8" TFT screen. The underlying repository for Conky itself is here.

Two pieces are involved: 1) the main Conky configuration with scripts mostly just the vanilla version of .conkyrc and 2) the part which interfaces with OctoPrint including mods to that .conkyrc. The second one itself uses my octo-client NodeJS interface.

Note that I should adjust the upper limit of the wi-fi bandwidth parameter since I've upgraded my router equipment; it always appears max'd out in that outer ring on the left-hand side of the screen (ignore).


As for how it works, the .conkyrc determines how often each underlying script runs, gathering information and then placing it where it belongs in the interface. Having worked with this long enough, I know that it's important not to choose intervals which are multiples of each other; this will result in multiple scripts running at once and hogging more than one core from the Raspi3.

${offset 5}${voffset -290}${color d7d7d7}${font neuropol:size=18}charming-pascal
${offset 5}${color aad7aa}${voffset -6}${font Nimbus Mono L,size:4,style:bold}Black Pearl ${font neuropol:size=8}v1.0.1
${offset 5}${voffset 5}${font font Nimbus Mono L,size:4,style:bold}${execpi 3000 node ~/sites/octo-conky version}
${offset 5}${voffset 10}${font font Nimbus Mono L,size:4,style:bold}${execpi 10 node ~/sites/octo-conky temp}
]]

Extracting a subset of this, ${execpi 10 node ~/sites/octo-conky temp} will execute the command node ~/sites/octo-conky temp every 10 seconds in this case. (I'm sure that I've modified the code since then so that every interval will be a prime number, though.)

Reading through the Conky objects docs, I see that execpi indicates:

Executes a shell command and displays the output in conky. warning: this takes a lot more resources than other variables. I'd recommend coding wanted behaviour in C and posting a patch. This differs from $exec in that it parses the output of the command, so you can insert things like ${color red}hi!${color} in your script and have it correctly parsed by Conky. Caveats: Conky parses and evaluates the output of $execp every time Conky loops, and then destroys all the objects. If you try to use anything like $execi within an $execp statement, it will functionally run at the same interval that the $execp statement runs, as it is created and destroyed at every interval.

So far, it has behaved quite nicely. I've had it on the printer now for about six months and it's never caused print-related problems. I do not believe it suffers from memory leakage.

I've found the core load section of the screen to be very useful in monitoring development. I have VNC running so that I can remotely see the TFT to watch this. The total load of Conky and its scripts is much smaller than running htop by itself.