Monday, October 30, 2006

update | alpha-6.3-1

I've just released v. 6.3-1 and the aforementioned failover scenarios are now handled properly.

ToDo:

  • Isolate the install/config scripts so folks can use the cluster with manual configuration (the way it's done with heartbeat, for instance). Ideally, the code should be forked so bugs are handled independently.
  • Rewrite the passwordless ssh setup so we use the ssh agent and can therefore store the private key with a passphrase (at the end user's choice, of course).
  • Use sudo to run anything that needs root, so the cluster can run with an unprivileged account

Lots of problems, lots of fun!

yay!

After two months of interrupted hard work (yes, interrupted), I got to handle two error situations almost flawlessly.

Almost :)

The first scenario was a simple mysqld stop on the master, which was properly handled by a remote mysqld start by the slave.

The second scenario was a hard failure on the master, which was properly handled by the slave (it took over the service), with the only added problem that it lost its original IP address, becoming reachable only through the cluster IP.
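
The two scenarios above boil down to a small decision made on the slave. Here is a minimal sketch of that decision in Python; the names (`MasterState`, `choose_action`) and the returned action strings are hypothetical and are not taken from the mysql-ha code.

```python
from enum import Enum


class MasterState(Enum):
    """Hypothetical view of the master as seen from the slave."""
    HEALTHY = "healthy"          # host reachable, mysqld answering
    MYSQLD_STOPPED = "stopped"   # host reachable, mysqld not running
    HOST_DOWN = "down"           # host unreachable (hard failure)


def choose_action(state: MasterState) -> str:
    """Return the action the slave would take for a given master state."""
    if state is MasterState.HEALTHY:
        return "none"
    if state is MasterState.MYSQLD_STOPPED:
        # Scenario 1: the master host is still up, so start mysqld remotely.
        return "remote_mysqld_start"
    # Scenario 2: the master is gone, so the slave takes over the service.
    return "takeover"
```

Keeping the decision separate from the code that actually runs ssh or moves the IP makes it easy to test each scenario on its own.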

Anyway, there's lots of hard work to do towards 1.0, but alpha-7 is much closer now, and we're much closer to saying 'happy clustering' again!

Monday, October 23, 2006

documentation update draft

I'm working on updating the cluster documentation to reflect the latest changes (in our code and in MySQL).

Here are some general notes.

Replication privileges

In order to allow replication from the slave host (slavehost), we only need to run this statement on the master host:

GRANT REPLICATION SLAVE ON *.* TO 'replicationuser'@'slavehost';

REPLICATION SLAVE is a global privilege in MySQL, so it has to be granted ON *.* (granting it on a single db.table is rejected by the server). mysql-ha creates a user with just this privilege, and it's recommended that you don't grant this user any more privileges than that.


mysql-ha limitations and known issues:

  • geographic distribution
Right now the cluster is based on sharing an IP address. This technique works only if both the master and slave nodes are on the same physical network. We need to modularize this code so that we can share a generic network resource instead, which could be an IP address, a dynamic DNS entry, etc.
  • remote execution security
Right now, remote script execution is based on passwordless remote ssh. In order to allow this, we set up pubkey/privkey based ssh without a passphrase. This is an obvious security issue. We need to use the ssh agent in order to use a passphrase to protect the private key. This should be provided as an option to the end user.
  • ARP spoofing
The cluster uses ARP spoofing only if the failover can't be forced on the master node. ARP spoofing is generally ignored by routers but is a normal technique used by clusters (heartbeat uses it by default to speed up the propagation of an IP address change). We should allow the customization of this with three options:
    • no spoofing
    • spoof only when needed (as is done now)
    • spoof always (as heartbeat)
Please note that ARP spoofing is only needed if the cluster uses a shared IP address.
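
The three proposed options could be expressed as a small policy check. This is only a sketch of the idea: `ArpPolicy`, `should_spoof` and their parameters are hypothetical names, not existing mysql-ha settings.

```python
from enum import Enum


class ArpPolicy(Enum):
    """The three proposed settings (hypothetical names)."""
    NEVER = "never"              # no spoofing
    WHEN_NEEDED = "when_needed"  # spoof only when needed (current behaviour)
    ALWAYS = "always"            # spoof always (as heartbeat does)


def should_spoof(policy: ArpPolicy,
                 uses_shared_ip: bool,
                 failover_forced_on_master: bool) -> bool:
    """Decide whether the slave should send spoofed ARP during a takeover."""
    if not uses_shared_ip:
        # ARP spoofing only makes sense with a shared IP address.
        return False
    if policy is ArpPolicy.NEVER:
        return False
    if policy is ArpPolicy.ALWAYS:
        return True
    # WHEN_NEEDED: spoof only if the master could not be made to fail over.
    return not failover_forced_on_master
```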

Another issue is that we execute remote commands using the root account. I'm currently working on switching the scripts over to sudo, so we no longer need the root password on the master/slave node and remote commands can be executed by non-privileged accounts.
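
To give an idea of what a sudo-based remote call could look like, here is a rough sketch that only builds the command line (it does not run anything). The function name, the account name and the init script path in the usage example are assumptions for illustration; `BatchMode=yes` and `sudo -n` are used so that neither ssh nor sudo stops to ask for a password.

```python
import shlex


def build_remote_command(user: str, host: str, command: list) -> list:
    """Build an argv list that runs `command` on `host` through ssh and sudo.

    The remote part is quoted with shlex.join because ssh hands it to the
    remote shell as a single string.
    """
    remote = "sudo -n -- " + shlex.join(command)
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", remote]


# Hypothetical usage: account, host and script path are made up.
argv = build_remote_command("mysqlha", "masterhost",
                            ["/etc/init.d/mysqld", "start"])
```

The resulting list can be passed to `subprocess.run(argv)`. The unprivileged account would need a matching sudoers entry limited to the commands the cluster really needs.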

More info on this later..

Thursday, October 19, 2006

alpha-7 roadmap

alpha-6.2 was the last release from the 6 series, and completed a wxPython-based cluster configurator.

New releases on this series will just include fixes to existing features.

I'm working hard on alpha-7, fixing issues with the installation procedure, with the ultimate goal of achieving a smooth install and a working takeover on FC5.

I noticed that the configuration-wrapper.sh script doesn't always fall back correctly, and there are
probably many more bugs introduced in the recent developments.

However, I hope to make a usable release again (I believe the last was alpha-5, with FC2) with alpha-7 RSN. Like.. next week or something :)

Roadmap:
- update the existing docs
- create documentation for a manual installation
- modularize the network resource sharing code, so that instead of using just a shared IP, the
cluster can also be implemented with dynamic DNS entries (for geographically distributed clusters).
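
One possible shape for the modularized network resource code is a common interface with one implementation per resource type. The sketch below is only an illustration of that idea: all class and method names are hypothetical, and the methods return a description of what they would do instead of touching the network.

```python
from abc import ABC, abstractmethod


class NetworkResource(ABC):
    """Something the active node holds so that clients can reach it."""

    @abstractmethod
    def acquire(self) -> str:
        """Describe how this node would take the resource."""

    @abstractmethod
    def release(self) -> str:
        """Describe how this node would give the resource up."""


class SharedIP(NetworkResource):
    def __init__(self, address: str):
        self.address = address

    def acquire(self) -> str:
        return f"bring up {self.address} on this node"

    def release(self) -> str:
        return f"bring down {self.address} on this node"


class DynamicDNSEntry(NetworkResource):
    def __init__(self, name: str, node_address: str):
        self.name = name
        self.node_address = node_address

    def acquire(self) -> str:
        return f"point {self.name} to {self.node_address}"

    def release(self) -> str:
        return f"remove {self.node_address} from {self.name}"


def take_over(resource: NetworkResource) -> str:
    """The takeover code only needs the interface, not the resource type."""
    return resource.acquire()
```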

Monday, October 09, 2006

ha update

Working like a dog lately..

mysql-ha has seen alpha-6.1 recently, with a revamped configuration script, supporting new backends (dialog for this release, and X for the next).

I'm learning wxPython as fast as I can :)

I'm also tidying up the code for a new service availability and clustering project I'll make available soon (GPL too, of course).

The new configuration backends are not required for a mysql-ha installation. I've added a configuration-wrapper.sh script, which tests for the availability of a given backend (dialog or python && wxPython) and fires up a script according to the best available backend. If none is found, the original configuration-menu.sh is run.
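
The selection logic can be sketched in a few lines. The real wrapper is a shell script, so the Python below is just an illustration; the function names and backend labels are hypothetical, and preferring wxPython over dialog when both are present is my assumption about what "best available" means.

```python
import importlib.util
import shutil


def pick_backend(has_dialog: bool, has_wx: bool) -> str:
    """Choose a configuration backend from what is available."""
    if has_wx:
        return "wx"        # graphical configurator (assumed to be preferred)
    if has_dialog:
        return "dialog"    # curses-style dialogs
    return "menu"          # fall back to the plain configuration-menu.sh


def detect_backend() -> str:
    """Probe the local system and pick a backend."""
    has_dialog = shutil.which("dialog") is not None
    has_wx = importlib.util.find_spec("wx") is not None
    return pick_backend(has_dialog, has_wx)
```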

As usual, we need more testing. I know people are reading the list because the project is being downloaded, so please let me know the problems you run into!

Regards,