Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsubscribe and Signoff Requests Cause Sympa Service Crash in 6.2.68 #1359

Closed
iulistadmin opened this issue Mar 9, 2022 · 41 comments
Closed
Labels
Milestone

Comments

@iulistadmin
Copy link

iulistadmin commented Mar 9, 2022

When Sympa receive an unsubscribe command in the subject of an email, it causes the service to crash. The unsubscribe request is held in the spool directory but the member is successfully unsubscribed from the list, despite an error in the log to the contrary. Other requests continue to be received but held until the unsubscribe request is removed from the spool and Sympa services are restarted.

Version

6.4.68

Installation method

RPMs via the EPEL repo for CentOS 7

Expected behavior

The unsubscribe requests should be processed without causing a service disruption.

Actual behavior

When Sympa receives an email with an unsubscribe command in the subject (e.g., SIG|SIGNOFF|UNS|UNSUBSCRIBE) the service will crash. Log data similar to the following is generated.

Mar  9 09:18:07 iulist-test-app sympa_msg[30606]: notice Sympa::Spindle::ProcessIncoming::_twist() Processing Sympa::Message <[email protected]>; [email protected]; [email protected]; [email protected]
Mar  9 09:18:22 iulist-test-app sympa_msg[30606]: notice Sympa::Request::Message::__parse() Parsing: SIGNOFF tsilver-test3-l
Mar  9 09:18:22 iulist-test-app sympa_msg[30606]: notice Sympa::Spindle::ProcessMessage::_twist() Processing Sympa::Request <action=signoff;[email protected];[email protected]>
Mar  9 09:18:22 iulist-test-app sympa_msg[30606]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2094 > Sympa::Database::do_prepared_query#360 > Sympa::DatabaseDriver::MySQL::connect#58 > Sympa::Database::connect#113 Transaction on database Sympa::DatabaseDriver::MySQL <db_host=iulist-test-db.uits.iu.edu;db_name=sympa;db_port=3306;db_user=sympa> was aborted: MySQL server has gone away
Mar  9 09:18:22 iulist-test-app sympa_msg[30606]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2094 > Sympa::Database::do_prepared_query#361 Unable to get a handle to sympa database
Mar  9 09:18:22 iulist-test-app sympa_msg[30606]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2109 Unable to gather information for user: [email protected]
Mar  9 09:18:22 iulist-test-app sympa_msg[30606]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1621 > Sympa::Database::commit#439 DIED: bug in logic. Ask developer at /usr/share/sympa/lib/Sympa/Database.pm line 439.
Mar  9 09:18:22 iulist-test-app task_manager[30610]: notice main::sigterm() Signal TERM received, still processing current task
Mar  9 09:18:22 iulist-test-app task_manager[30610]: notice main:: Task_Manager exited normally due to signal
Mar  9 09:18:22 iulist-test-app bulk[30616]: notice main::sigterm() Signal TERM received, still processing current task
Mar  9 09:18:22 iulist-test-app bulk[30616]: notice main:: Bulk exited normally due to signal
Mar  9 09:18:22 iulist-test-app bounced[30617]: notice main::sigterm() Signal TERM received, still processing current task
Mar  9 09:18:22 iulist-test-app bounced[30617]: notice main:: Bounced exited normally due to signal
Mar  9 09:18:22 iulist-test-app archived[30615]: notice main::sigterm() Signal TERM received, still processing current task
Mar  9 09:18:22 iulist-test-app archived[30615]: notice main:: Archived exited normally due to signal

Steps to reproduce

Send an email to the command processing address with an unsubscribe command in the subject (e.g., SIG|SIGNOFF|UNS|UNSUBSCRIBE ) and Sympa will terminate. This has been replicated from several clients and online email services (i.e., Gmail, Yahoo! Mail, Xfinity).

Additional information

Sending to the LISTNAME-unsubscribe@domain address does NOT cause a service crash. Sympa will consistently process these requests successfully.

@ikedas
Copy link
Member

ikedas commented Mar 11, 2022

@iulistadmin , could youy please apply this patch and check if the problem will be solvbed?

@iulistadmin
Copy link
Author

iulistadmin commented Mar 14, 2022 via email

@racke
Copy link
Contributor

racke commented Mar 14, 2022

I suggest to find out why the database connection is terminated, so please check MySQL logs.

@iulistadmin
Copy link
Author

iulistadmin commented Mar 14, 2022 via email

@racke
Copy link
Contributor

racke commented Mar 14, 2022

Sympa is going to send a similar query/transaction for each unsubscribe request. So in case there is a bug in Sympa, it should be reproducible. As you say it happens not always in production. My opinion is that you exhaust some MySQL or system resource.

@ikedas
Copy link
Member

ikedas commented Mar 14, 2022

Still crashing. Let us know if we can provide further details to help resolve this issue.
Sympa::List::delete_list_member#1621 > Sympa::Database::commit#453 DIED: bug in logic. Ask developer at /usr/share/sympa/lib/Sympa/Database.pm line 453.

@iulistadmin , please show the log from when Sympa received the message to when Sympa crashed.

@racke
Copy link
Contributor

racke commented Mar 15, 2022

A sample file containing the message which crashes Sympa would be helpful too.

@iulistadmin
Copy link
Author

iulistadmin commented Mar 15, 2022 via email

@iulistadmin
Copy link
Author

iulistadmin commented Mar 15, 2022 via email

@igoltz
Copy link

igoltz commented Apr 26, 2022

Hello, we hit the same issue. It was not seen before the 6.2.68 version. We do have separate database (maria db) server, Sympa is build from source.

Sympa crashed while working on a signoff mail:
err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1621 > Sympa::Database::commit#439 DIED: bug in logic. Ask developer at /usr/share/sympa/lib/Sympa/Database.pm line 439.

And sympa can't be started. There is still a mail in /var/spool/sympa/msg triggering the crash.
I applied the patch but this has no affect.
Here the log with debug level 3, removed a lot of dbug/debug3 lines when doing systemctl start sympa

Apr 26 16:47:09 sympa-server systemd[1]: Starting Sympa mailing list manager...
Apr 26 16:47:09 sympa-server sympa/health_check[1145646]: info Sympa::CLI::check::_run() Configuration file read, default log level 3

Apr 26 16:47:09 sympa-server sympa_msg[1145648]: info main::_load() Configuration file read, default log level 3
conf_table WHERE robot_conf = ? AND label_conf = ?"

Apr 26 16:47:10 sympa-server sympa_msg[1145648]: notice Sympa::Process::daemonize() Starting sympa/msg daemon, PID 1145651

Apr 26 16:47:10 sympa-server sympa_msg[1145651]: notice main:: Sympa/msg 6.2.68 Started

Apr 26 16:47:10 sympa-server sympa_msg[1145651]: notice Sympa::Request::Message::__parse() Parsing: unsub newsletter-schuelerbereich

Apr 26 16:47:10 sympa-server systemd[1]: Started Sympa mailing list manager.

Apr 26 16:47:10 sympa-server sympa_msg[1145651]: notice Sympa::Spindle::ProcessMessage::_twist() Processing Sympa::Request <action=signoff;context=liste@vhost.*.*;email=user@*.*>
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug2 Sympa::Scenario::new(Sympa::Scenario, Sympa::List <liste@vhost.*.*>, unsubscribe, ...)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::search_fullpath(Sympa::List <liste@vhost.*.*>, unsubscribe.open, subdir)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::get_search_path(Sympa::List <liste@vhost.*.*>, subdir, scenari)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::search_fullpath() Name: unsubscribe.open; file /usr/share/sympa/default/scenari/unsubscribe.open
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::Scenario::_parse_scenario() (# unsubscribe.open
                                                           title.gettext open
                                                           
                                                           !equal([sender],[email])           smtp,dkim,md5,smime -> request_auth([email])
                                                           true()                             smtp,dkim,md5,smime -> do_it
                                                           , /usr/share/sympa/default/scenari/unsubscribe.open)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug2 Sympa::Scenario::_compile_scenario(Sympa::List <liste@vhost.*.*>, unsubscribe, ...)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug2 Sympa::Scenario::new(Sympa::Scenario, Sympa::List <liste@vhost.*.*>, include, ...)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::search_fullpath(Sympa::List <liste@vhost.*.*>, include.unsubscribe.header, subdir)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::get_search_path(Sympa::List <liste@vhost.*.*>, subdir, scenari)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::Scenario::_compile_scenario() Verify rule !equal([sender],[email]), auth smtp,dkim,md5,smime, action request_auth([email])
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::Scenario::_compile_scenario() Verify rule true(), auth smtp,dkim,md5,smime, action do_it
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug2 Sympa::Scenario::authz(Sympa::Scenario <unsubscribe.open;/usr/share/sympa/default/scenari/unsubscribe.open>, smtp, HASH, ...)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::Scenario::do_equal(Sympa::List <liste@vhost.*.*>,equal,...)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug2 Sympa::List::delete_list_member(Sympa::List <liste@vhost.*.*>, ...)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug2 Sympa::List::insert_delete_exclusion(Sympa::List <liste@vhost.*.*>, user@*.*, insert)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug2 Sympa::List::get_list_member(Sympa::List <liste@vhost.*.*>, user@*.*)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::Database::do_prepared_query() Will perform query "SELECT bounce_subscriber AS "bounce", bounce_address_subscriber AS "bounce_address", bounce_score_subscriber AS "bounce_score", custom_attribute_subscriber AS "custom_attribute", date_epoch_subscriber AS "date", user_subscriber AS "email", suspend_end_date_subscriber AS "enddate", comment_subscriber AS "gecos", inclusion_subscriber AS "inclusion", inclusion_ext_subscriber AS "inclusion_ext", inclusion_label_subscriber AS "inclusion_label", number_messages_subscriber AS "number_messages", reception_subscriber AS "reception", suspend_start_date_subscriber AS "startdate", subscribed_subscriber AS "subscribed", suspend_subscriber AS "suspend", topics_subscriber AS "topics", update_epoch_subscriber AS "update_date", visibility_subscriber AS "visibility" FROM subscriber_table WHERE user_subscriber = ? AND list_subscriber = ? AND robot_subscriber = ?"
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::Database::do_prepared_query() Did not find prepared statement for SELECT bounce_subscriber AS "bounce", bounce_address_subscriber AS "bounce_address", bounce_score_subscriber AS "bounce_score", custom_attribute_subscriber AS "custom_attribute", date_epoch_subscriber AS "date", user_subscriber AS "email", suspend_end_date_subscriber AS "enddate", comment_subscriber AS "gecos", inclusion_subscriber AS "inclusion", inclusion_ext_subscriber AS "inclusion_ext", inclusion_label_subscriber AS "inclusion_label", number_messages_subscriber AS "number_messages", reception_subscriber AS "reception", suspend_start_date_subscriber AS "startdate", subscribed_subscriber AS "subscribed", suspend_subscriber AS "suspend", topics_subscriber AS "topics", update_epoch_subscriber AS "update_date", visibility_subscriber AS "visibility" FROM subscriber_table WHERE user_subscriber = ? AND list_subscriber = ? AND robot_subscriber = ?. Doing it
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::Database::connect(Sympa::DatabaseDriver::MySQL <db_host=sympa-db-server;db_name=sympa;db_options=mysql_ssl=1;mysql_ssl_ca_file=/etc/my.cnf.d/certificates/CA.crt;mysql_ssl_verify_server_cert=1;db_user=sympa-db-user>)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2094 > Sympa::Database::do_prepared_query#360 > Sympa::DatabaseDriver::MySQL::connect#58 > Sympa::Database::connect#113 Transaction on database Sympa::DatabaseDriver::MySQL <db_host=sympa-db-server;db_name=sympa;db_options=mysql_ssl=1;mysql_ssl_ca_file=/etc/my.cnf.d/certificates/CA.crt;mysql_ssl_verify_server_cert=1;db_user=sympa-db-user> was aborted: MySQL server has gone away
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2094 > Sympa::Database::do_prepared_query#362 Unable to get a handle to sympa database
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2109 Unable to gather information for user: user@*.*
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::Database::do_prepared_query() Will perform query "DELETE FROM subscriber_table WHERE user_subscriber = ? AND list_subscriber = ? AND robot_subscriber = ?"
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::Database::do_prepared_query() Did not find prepared statement for DELETE FROM subscriber_table WHERE user_subscriber = ? AND list_subscriber = ? AND robot_subscriber = ?. Doing it
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug3 Sympa::Database::connect(Sympa::DatabaseDriver::MySQL <db_host=sympa-db-server;db_name=sympa;db_options=mysql_ssl=1;mysql_ssl_ca_file=/etc/my.cnf.d/certificates/CA.crt;mysql_ssl_verify_server_cert=1;db_user=sympa-db-user>)
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: debug2 Sympa::Database::connect() Connected to Database Sympa::DatabaseDriver::MySQL <db_host=sympa-db-server;db_name=sympa;db_options=mysql_ssl=1;mysql_ssl_ca_file=/etc/my.cnf.d/certificates/CA.crt;mysql_ssl_verify_server_cert=1;db_user=sympa-db-user>
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: info Sympa::List::delete_list_member() Unable to remove list member user@*.* from list Sympa::List <liste@vhost.*.*>: Not on list
Apr 26 16:47:10 sympa-server sympa_msg[1145651]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1621 > Sympa::Database::commit#449 DIED: bug in logic. Ask developer at /usr/share/sympa/lib/Sympa/Database.pm line 449.
Apr 26 16:47:10 sympa-server systemd[1]: sympa.service: Main process exited, code=exited, status=255/n/a
Apr 26 16:47:10 sympa-server systemd[1]: sympa.service: Failed with result 'exit-code'.

@ikedas
Copy link
Member

ikedas commented Apr 27, 2022

@igoltz , could you please apply this patch and check if the problem will be solvbed?

@igoltz
Copy link

igoltz commented Apr 27, 2022

@ikedas I already had the patch applied, it has no effect

@ikedas
Copy link
Member

ikedas commented Apr 27, 2022

Just in case, have you restarted all services (Sympa and web interface) after applying the patch?

@igoltz
Copy link

igoltz commented Apr 27, 2022

In the
Disconnected: Transaction (if any) was aborted.
block
if (delete $self->{_sdbTransactionLevel}) evaluates to true. This then leading to the crash
on/during each start of Sympa with the triggering message still in the spool!
No problem with database, gui works, Sympa starts till the spindle spins.

@ikedas
Copy link
Member

ikedas commented Apr 27, 2022

Could you please restart all services (Sympa and web interface) again?

@igoltz
Copy link

igoltz commented Apr 27, 2022

@ikedas
OK, I just restarted wwsymopa. But isn't this a complete different process?

Sympa itself crashes on (or sligthly after) every start with the above error, so in fact it's a fresh start each time.

@ikedas
Copy link
Member

ikedas commented Apr 27, 2022

Could you please let us know these things?

  • The exact version of MySQL (or MariaDB).
  • Database setting of Sympa, i.e. all db_* parameters except db_password in sympa.conf.

@igoltz
Copy link

igoltz commented Apr 27, 2022

MariaDB 10.5.9 (mariadb-10.5.9-1.module_el8.4.0+2357+88cd7ba1.x86_64) on dedicated host.

db_type mysql
db_host	sympa-db-server
db_name sympa
db_user	sympa
db_options      mysql_ssl=1;mysql_ssl_ca_file=/etc/my.cnf.d/certificates/CA.crt;mysql_ssl_verify_server_cert=1

@ikedas
Copy link
Member

ikedas commented Apr 27, 2022

Thank you.

I'll try reproducing the problem using those parameters. I'm sorry but It'll take some time: I'll be in vacation in the next week.

The other questions that come to my mind:

  • Can you try not using mysql_ssl option? Though I'm not sure it is related to the problem.
  • Don't you set up db cluster (e.g. Galera) nor load balancer (e.g. HAProxy) with MariaDB? If you do, it can be the cause.

@ikedas
Copy link
Member

ikedas commented Apr 27, 2022

Addition. Also please check the log of MySQL server to know the reason why the database connection terminated.

@igoltz
Copy link

igoltz commented Apr 27, 2022

@ikedas

  • connection must be TLS, I only may remove strict cert. checking
  • no, it's single node with direct connection. But Galery/maxscale should be no problem

As this problem did not exist before 6.2.68 I now tried to revert the changes since 6.2.66. Replaced

DatabaseDriver.pm
DatabaseManager.pm
Database.pm
DatabaseDriver/MySQL.pm

with the versions taged 6.2.66 and now Sympa starts and processes the queue.
So probably changes in commit 09bcf96 cause the new behavior.

@igoltz
Copy link

igoltz commented Apr 27, 2022

@ikedas
there is no entry in the maria log , no sign that the connection terminated

@ikedas
Copy link
Member

ikedas commented Apr 27, 2022

@ikedas

  • connection must be TLS, I only may remove strict cert. checking
  • no, it's single node with direct connection. But Galery/maxscale should be no problem

As this problem did not exist before 6.2.68 I now tried to revert the changes since 6.2.66. Replaced

DatabaseDriver.pm
DatabaseManager.pm
Database.pm
DatabaseDriver/MySQL.pm

with the versions taged 6.2.66 and now Sympa starts and processes the queue. So probably changes in commit 09bcf96 cause the new behavior.

It's needed for PostgreSQL to work properly.
Anyways the error message "MySQL server has gone away" is generated by mysql client library (here). At least from view of Sympa, connection to db server actually terminates.

@igoltz
Copy link

igoltz commented Apr 27, 2022

@ikedas
Back on 6.2.68 I am able to reproduce it. I did save the messages from the msg spool. Changing the msg id and copy to the spool: 1 causes no problem, 3 do trigger the failure.
Let me check if I find differences in the mails.

@igoltz
Copy link

igoltz commented Apr 27, 2022

@ikedas
Very interesting: If I copy those messages while Sympa is running they are processed fine. If they are placed in the spool and Sympa started afterwards then 3 cause the failure.

I could not spot any obvious differences between good/bad mails yet.

@igoltz
Copy link

igoltz commented Apr 27, 2022

@ikedas
New finding: the mail which triggers not tries to unsubscribe from non-existing list, sender get notified about this.
If I change a "bad" mail so that it unsubscribes from non-existing list then it is processed.
So the difference here is existing vs non-existing lists.

@racke
Copy link
Contributor

racke commented Apr 27, 2022

@ikedas

  • connection must be TLS, I only may remove strict cert. checking
  • no, it's single node with direct connection. But Galery/maxscale should be no problem

As this problem did not exist before 6.2.68 I now tried to revert the changes since 6.2.66. Replaced

DatabaseDriver.pm
DatabaseManager.pm
Database.pm
DatabaseDriver/MySQL.pm

with the versions taged 6.2.66 and now Sympa starts and processes the queue. So probably changes in commit 09bcf96 cause the new behavior.

It's needed for PostgreSQL to work properly. Anyways the error message "MySQL server has gone away" is generated by mysql client library (here). At least from view of Sympa, connection to db server actually terminates.

@igoltz Can you also add versions of MySQL libs + related Perl modules on the Sympa server?

@igoltz
Copy link

igoltz commented Apr 27, 2022

So not finding anything in the database connection, where does the mysql error message come from?
I included an additional log printing the $DBI::errstr before entering the "Disconnected: Transaction (if any) was aborted." block.

--- Database.pm.org	2022-03-16 15:00:05.205332868 +0100
+++ Database.pm	2022-04-27 14:12:31.662306136 +0200
@@ -108,6 +108,10 @@
             $self);
         return 1;
     }
+
+$log->syslog('debug3', 'IGO after ping', $self);
+$log->syslog('debug3', 'IGO msg outside if: Transaction on database %s was aborted: %s', $self, $DBI::errstr);
+
     # Disconnected: Transaction (if any) was aborted.
     if (delete $self->{_sdbTransactionLevel}) {
         $log->syslog('err', 'Transaction on database %s was aborted: %s',
@@ -116,6 +120,7 @@
         return undef;
     }
 
+
     # Do we have required parameters?
     foreach my $param (@{$self->required_parameters}) {
         unless (defined $self->{$param}) {

Trying with BAD and OK message I see that the same $DBI::errstr is printed while processing either messages! Here the log from the good one which is processed fine.

Apr 27 14:16:02 sympa-server sympa_msg[44309]: debug3 Sympa::Database::connect(Sympa::DatabaseDriver::MySQL <db_host=sympa-db-server;db_name=sympa;db_options=mysql_ssl=1;mysql_ssl_ca_file=/etc/my.cnf.d/certificates/CA.crt;mysql_ssl_verify_server_cert=1;db_port=3306;db_user=sympa>)
Apr 27 14:16:02 sympa-server sympa_msg[44309]: debug3 Sympa::Database::connect() IGO after ping
Apr 27 14:16:02 sympa-server sympa_msg[44309]: debug3 Sympa::Database::connect() IGO msg outside if: Transaction on database Sympa::DatabaseDriver::MySQL <db_host=sympa-db-server;db_name=sympa;db_options=mysql_ssl=1;mysql_ssl_ca_file=/etc/my.cnf.d/certificates/CA.crt;mysql_ssl_verify_server_cert=1;db_port=3306;db_user=sympa> was aborted: Lost connection to MySQL server during query
Apr 27 14:16:02 sympa-server sympa_msg[44309]: debug2 Sympa::Database::connect() Connected to Database Sympa::DatabaseDriver::MySQL <db_host=sympa-db-server;db_name=sympa;db_options=mysql_ssl=1;mysql_ssl_ca_file=/etc/my.cnf.d/certificates/CA.crt;mysql_ssl_verify_server_cert=1;db_port=3306;db_user=sympa>
Apr 27 14:16:02 sympa-server sympa_msg[44309]: debug2 Sympa::Message::Template::_new_from_template(Sympa::Message::Template, *, command_report.tt2, user@*.*, HASH)
Apr 27 14:16:02 sympa-server sympa_msg[44309]: debug3 Sympa::get_search_path(*, subdir, mail_tt2)
Apr 27 14:16:02 sympa-server sympa_msg[44309]: debug2 Sympa::Message::new(Sympa::Message::Template, ...)
Apr 27 14:16:02 sympa-server sympa_msg[44309]: notice Sympa::Spindle::ProcessTemplate::_twist() Processing Sympa::Message::Template <sympa@liste@vhost.*.*.1651061763.44309,9399>; envelope_sender=sympa-request@liste@vhost.*.*; message_id=sympa.1651061762.536492.44309.828@liste@vhost.*.*; recipients=user@*.*; sender=sympa@liste@vhost.*.*; template=command_report;
Apr 27 14:16:02 sympa-server sympa_msg[44309]: notice Sympa::Spool::Outgoing::store() Message Sympa::Message::Template <sympa@liste@vhost.*.*.1651061763.44309,9399> is stored into bulk spool as <1.5.1651061763.1651061762.548650.sympa@liste@vhost.*.*_s,44309,7980>

@iulistadmin
Copy link
Author

Been following the latest discussion about this issue. Here are the specifics for our database instance in production (we also have test and dev environments with dev hosting the database on the same server as the app server). We see that when Sympa crashes due to a signoff request, the list does exist and the member is removed, but the message remains in the spool and will prevent Sympa from starting.

MySQL Server version: 5.7.37
Options in Sympa for DB:
db_type mysql
db_name sympa
db_host list-db.uits.indiana.edu
db_port 3306
db_list_cache on

@ikedas
Copy link
Member

ikedas commented Apr 28, 2022

Peoblem is that, when ping() fails (i.e. no db connection), {_sdbTransactionLevel} is already positive (i.e. begin() has already been executed). It looks like that database connection lost (terminated?) between executions of begin() and the next SQL statement.

@ikedas
Copy link
Member

ikedas commented May 9, 2022

@igoltz and @iulistadmin , can you tell us the exact version of DBD-MySQL module (perl-DBD-MySQL, if you are using RPM package) ?

@iulistadmin
Copy link
Author

Name : perl-DBD-MySQL
Arch : x86_64
Version : 4.023
Release : 6.el7
Size : 323 k
Repo : installed

@igoltz
Copy link

igoltz commented May 10, 2022

perlbrew perl-5.30.3 compiled on almalinux 8
modules installed with cpanm, mariadb-devel rpm from alma repo has version 10.5
DBD::Mysql is version 4.050

@ikedas
Copy link
Member

ikedas commented May 16, 2022

@Jylhis, the improvement introduced by #1186 have some problems running on (at least) MySQL/MariaDB. we'd be better to revert it once and rework it in the future (e.g. on separate branch).

@iulistadmin and @igoltz , to revert the changes in 6.2.68, please check this patch. The next release (6.2.70?) will contain this patch.

@ikedas ikedas changed the title Unsubscribe and Signoff Requests Cause Sympa Service Crash in 6.4.68 Unsubscribe and Signoff Requests Cause Sympa Service Crash in 6.2.68 May 16, 2022
@ikedas ikedas added bug and removed question labels May 16, 2022
@iulistadmin
Copy link
Author

@ikedas We applied this to our non-production environment and it indeed resolved our issue. Thank you!

When is 6.2.70 scheduled for (pre)release?

@igoltz
Copy link

igoltz commented May 19, 2022

@ikedas Yes as far as I can reproduce the failure is gone now.

@ikedas
Copy link
Member

ikedas commented May 23, 2022

The changes for #1186 have been reverted. This issue is closed for now.

@ikedas ikedas closed this as completed May 23, 2022
@ikedas ikedas added this to the 6.2.70 milestone Jun 9, 2022
@iulistadmin
Copy link
Author

iulistadmin commented Aug 2, 2022

We applied the patch for this issue Sunday July 10th but on July 15th we experienced another crash again related to a signoff request. The message was removed from the queue and Sympa was started.

The Sympa logs from the event are provided. There only thing in the MySQL logs is “Got timeout reading communication packets”. This error occurs when idle threads are killed by wait_timeout. Does Sympa call mysql_close() in scripts when the connection is no longer needed?

Another Sympa admin stated on the users mailing list,

"If Sympa is going to open connections and use them persistently, then it should probably issue a SET SESSION WAIT_TIMEOUT = 28800 (or some other suitable value) upon opening its connection. On the other hand if it is
intended that Sympa's database connections be transient, then as James notes, when it is done with a connection it has opened it should properly close it, not just abandon it."

Jul 15 02:25:17 list-server sympa_msg[3001]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2094 > Sympa::Database::do_prepared_query#360 > Sympa::DatabaseDriver::MySQL::connect#58 > Sympa::Database::connect#113 Transaction on database Sympa::DatabaseDriver::MySQL <db_host=list-db.uits.indiana.edu;db_name=sympa;db_port=3306;db_user=sympa> was aborted: Turning off AutoCommit failed
Jul 15 02:25:17 list- server sympa_msg[3001]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2094 > Sympa::Database::do_prepared_query#361 Unable to get a handle to sympa database
Jul 15 02:25:17 list-server sympa_msg[3001]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2094 > Sympa::Database::do_prepared_query#360 > Sympa::DatabaseDriver::MySQL::connect#58 > Sympa::Database::connect#113 Transaction on database Sympa::DatabaseDriver::MySQL <db_host=list-db.uits.indiana.edu;db_name=sympa;db_port=3306;db_user=sympa> was aborted: Turning off AutoCommit failed
Jul 15 02:25:17 list-server sympa_msg[3001]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2094 > Sympa::Database::do_prepared_query#361 Unable to get a handle to sympa database
Jul 15 02:25:17 list-server sympa_msg[3001]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2094 > Sympa::Database::do_prepared_query#360 > Sympa::DatabaseDriver::MySQL::connect#58 > Sympa::Database::connect#113 Transaction on database Sympa::DatabaseDriver::MySQL <db_host=list-db.uits.indiana.edu;db_name=sympa;db_port=3306;db_user=sympa> was aborted: Turning off AutoCommit failed
Jul 15 02:25:17 list-server sympa_msg[3001]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2094 > Sympa::Database::do_prepared_query#361 Unable to get a handle to sympa database
Jul 15 02:25:17 list-server sympa_msg[3001]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2109 Unable to gather information for user: [email protected]
Jul 15 02:25:17 list-server sympa_msg[3001]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1561 > Sympa::List::insert_delete_exclusion#1887 > Sympa::List::get_list_member#2109 Unable to gather information for user: [email protected]
Jul 15 02:25:17 list-server sympa_msg[3001]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1621 > Sympa::Database::commit#439 DIED: bug in logic. Ask developer at /usr/share/sympa/lib/Sympa/Database.pm line 439.
Jul 15 02:25:17 list-server sympa_msg[3001]: err main::#243 > Sympa::Spindle::spin#95 > Sympa::Spindle::DoCommand::_twist#120 > Sympa::Spindle::spin#95 > Sympa::Request::Handler::signoff::_twist#74 > Sympa::List::delete_list_member#1621 > Sympa::Database::commit#439 DIED: bug in logic. Ask developer at /usr/share/sympa/lib/Sympa/Database.pm line 439.
Jul 15 02:25:17 list-server task_manager[3004]: notice main::sigterm() Signal TERM received, still processing current task
Jul 15 02:25:17 list-server task_manager[3004]: notice main::sigterm() Signal TERM received, still processing current task
Jul 15 02:25:17 list-server task_manager[3004]: notice main:: Task_Manager exited normally due to signal
Jul 15 02:25:17 list-server task_manager[3004]: notice main:: Task_Manager exited normally due to signal
 
 
2022-07-15T02:23:04.063626Z 33677 [Note] Aborted connection 33677 to db: 'sympa' user: 'sympa' host: 'list-server' (Got timeout reading communication packets)
2022-07-15T02:23:27.924353Z 33678 [Note] Aborted connection 33678 to db: 'sympa' user: 'sympa' host: 'list-server' (Got timeout reading communication packets)
2022-07-15T02:27:15.816937Z 33682 [Note] Aborted connection 33682 to db: 'sympa' user: 'sympa' host: 'list-server' (Got timeout reading communication packets)
2022-07-15T02:28:40.091120Z 33681 [Note] Aborted connection 33681 to db: 'sympa' user: 'sympa' host: 'list-server' (Got timeout reading communication packets)

@ikedas
Copy link
Member

ikedas commented Aug 3, 2022

Hi @iulistadmin ,

To me, it looks like either the patch was not applied at all or the service was not restarted after the patch was applied.

@iulistadmin
Copy link
Author

We have since rebooted the system, so we will allow the system to process requests as normal and monitor for any issues. Thanks!

@iulistadmin
Copy link
Author

To follow-up, so far, so good. No further crashes after restarting services following the patches. Thank you to the development team for resolving this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants