Re: Hub 3 Update
By: Al to deon on Fri Apr 12 2024 05:00 am
Hey Al,
I have noticed here in the last month or two that the net 3 hub gets put on hold because it doesn't answer, or because there is some failure.
I clear those holds periodically and things flow as expected until the next hold comes along.
I just cleared the holds on net 3 an hour ago or so.
OK, there are probably a couple of reasons for this:
* There is major construction going on nearby, and they are constantly taking my internet down for "maintenance" - and it's prolonged (usually around 10 hrs). (They are rebuilding the rail line near me, and it's an 18-24 month project while they move it above ground.) So I imagine these long outages are probably a primary reason.
(I have a hotspot, which gets traffic when my main cable goes down, so mail still flows, but only outbound from me.)
* I've taken the hub down for updates.
* My nightly backup "pauses" the container and backs up the hub, but that should only take a few mins. That might be happening while there is a session active, though.
* My IPv4 link goes down (IPv6 is much more reliable...)
Tonight I stopped the hub from accepting inbound calls while I cleared the backlog - it made it easier for me to trace a problem in the logs - which is when I noticed the kernel killing the db... ;)
How many failed attempts (and time) before your system puts me on hold?
Sometimes when I watch mailer sessions with hub 3 the session is very slow. This could also be the cause of failures. I don't know why the session progresses slowly. A lack of memory perhaps?
Slow as in there is a delay before there are transfers? binkp by default has a 5 min timeout, hopefully not that slow that it times out?
Outbound mail bundles are built on the fly, and the DB has a lot of mail in it (I've never deleted anything...), but it should be seconds before mail packets are ready, not minutes...
I just looked in the logs for a session tonight, and it looks like 2-5s:
[2024-04-12 22:05:02] production.INFO: PB-:- We have authed these AKAs [21:4/106.0@fsxnet] {"pid":268}
[2024-04-12 22:05:04] production.INFO: MA-:= Got [1] echomails for [21:4/106.0@fsxnet] for sending {"pid":268}
[2024-04-12 22:05:07] production.INFO: IS-:- Sending item [0] (118c0100.pkt) {"pid":268}
[2024-04-12 22:05:07] production.INFO: PB-:= Packet/File [118c0100.pkt], type [4] sent. {"pid":268}
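As a rough check on that 2-5s figure, here is a minimal Python sketch that pulls the timestamps out of the four log lines above and measures the gaps between them. The function names and the 300-second constant (the 5 min binkp default mentioned earlier) are my own for illustration, not anything from the hub software.

```python
import re
from datetime import datetime

# The four log lines quoted above, verbatim.
LOG_LINES = [
    '[2024-04-12 22:05:02] production.INFO: PB-:- We have authed these AKAs [21:4/106.0@fsxnet] {"pid":268}',
    '[2024-04-12 22:05:04] production.INFO: MA-:= Got [1] echomails for [21:4/106.0@fsxnet] for sending {"pid":268}',
    '[2024-04-12 22:05:07] production.INFO: IS-:- Sending item [0] (118c0100.pkt) {"pid":268}',
    '[2024-04-12 22:05:07] production.INFO: PB-:= Packet/File [118c0100.pkt], type [4] sent. {"pid":268}',
]

# Leading "[YYYY-MM-DD HH:MM:SS]" stamp on each line.
TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")

# Assumption: the 5 minute binkp default mentioned in this message.
BINKP_TIMEOUT_S = 300


def timestamps(lines):
    """Return the parsed timestamp of every line that starts with one."""
    stamps = []
    for line in lines:
        match = TS_RE.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return stamps


def session_delays(lines):
    """Return the gap in seconds between each pair of consecutive log lines."""
    stamps = timestamps(lines)
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]


delays = session_delays(LOG_LINES)
total = sum(delays)
print("gaps (s):", delays)
print("auth to packet sent (s):", total)
print("within binkp timeout:", total < BINKP_TIMEOUT_S)
```

For this session the gaps come out at 2s, 3s and 0s, so 5s from auth to packet sent, nowhere near the timeout.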
That said, I've noticed the website is slowing down, so I may need to think about better DB indexes and/or deleting some mail.
To be honest, I'm surprised that memory is the issue - docker stats show it using < 200MB of the 512MB that I had assigned to the DB, yet the kernel was killing it (oom-killer). I've doubled it just in case, but I'll need to keep an eye on it.
--- SBBSecho 3.20-Linux
* Origin: I'm playing with ANSI+videotex - wanna play too? (21:2/116)