Wednesday, December 27, 2006

Celebrating new year with our first beta

I'm celebrating the new year with the first mysql-ha beta release.

I've split the releases in two packages:
  • mysql-ha, which continues with the release numbering we've used until now, and includes only the code needed in order to run the cluster. You must install and configure the cluster manually, following this guide. This package is now in beta.
  • installer, which is still alpha, and is numbered from 0.1.0 again.

I hope that this split will encourage more people to try out the cluster, by isolating the code that had more bugs and caused more trouble (the automatic installer) from the almost stable cluster code.

Friday, December 22, 2006

plan9

My first plan9 system is installed.

The installation went smoothly on a somewhat old system (Celeron w/96 MB RAM).
I tried running it on an even older system (you can install it on 32 MB RAM) but it didn't
detect the videocard properly.

You can just follow the instructions from the installer and you'll be up and running in a short time
(a little less than two hours on that hardware, but the disk is really slow).

I was amazed at the fast boot procedure, but reading the docs, it just loads a very minimal kernel
(read this) and you can load additional servers on demand.

The main idea behind plan9 is that everything is a file.
If you're like me, you might say 'I thought everything was a file in UNIX already'.
Go read that link where they explain what's wrong with UNIX's model of 'everything is a file', and how it's broken.

As I said, everything is indeed a file in plan9, and files can be accessed regardless of their location (local/net).

If I'm not making much sense yet, it's because I'm still figuring this out myself, and because I haven't had much sleep in the last two days (You gotta be interested in OS design in order to try plan9, but you really gotta be interested in OS design in order to try a new OS while a 5-month-old baby is in the house!).

Still, I managed to take a screenshot just to document my first system, and I already broke it, too. I can't connect from my gnu/linux box anymore. Apparently I fucked up keyfs, which is good, really; I learned many things about UNIX by fucking things up too..

Here you can see me surfing the mysql-ha site using abaco, one of plan9's browsers.
As you can see, the goal of the system is not to be used as a desktop (though several people at Bell Labs use it as their only system..). In fact, the recommended way to browse the web is to vnc to another (non plan9) node and use that browser.

However, I'm not interested in using this as a desktop, but rather as a platform to develop service monitoring solutions, and as a way to learn more about OS design.

Don't worry, though, I won't discontinue mysql-ha anytime soon :)

Wednesday, December 20, 2006

cpu fan cooler issues, and my plan B...

I'm about to take a short break for the holidays, but these last few days have been extra slow due to continuous issues with the server which hosts the mysql-ha project.

Last night the cpu fan cooler died on us, and we're using a borrowed (we're that much in need of hardware! hehe) one until tomorrow.

This week I intend to get the site back to stable, with a geographically distant (3 km) backup node in case anything goes wrong with the server. (One node is in my dad's house, the other in our office, but 'geographically distant' sounds fancier, right?)

Our network link is also to blame for a lot of our issues, but there's not much we can do about that, unless you can figure out a way to break a state-owned monopoly!

Anyway, as you can see, right now work is focused on mundane sysadmin tasks (but a few nice scripts and maybe another project might come out of this), but by next weekend I intend to have what will probably be the last release of 2006, and hopefully the last alpha...

BTW, I'm making good use of my (very little) spare time, so while running some backups on our server, I'm installing plan9 on our older server (we had to take that box out of duty a couple of months ago due to the heavy traffic that new interest in the project produced!).
I haven't seen much of it so far, but I have a feeling I'll be getting a crush real soon now..

I intend to document my experience with this system on this blog too, so I hope the seven people that read me regularly (I read my stats!) find this interesting too.

Happy Hacking!

Monday, December 04, 2006

alpha-0.6.5

This post is long overdue.
alpha-0.6.5 has been out for almost two weeks (despite the wrong release date on our server, which I'll fix soon enough).

From the promises made on 0.6.4, we have:
- sudo
- ssh-agent
What we don't have (yet) is heavy testing under CentOS 4, but we're starting with that today, so expect
lots of commits and maybe a couple of small releases this week.

I think we'll feature freeze here and focus on stability so we can get to beta ASAP.

Some testers would be fine :)

Kind regards, etc.

Tuesday, November 07, 2006

last changes | roadmap update

alpha-0.6.4 was the last release, though there might be another one out today.

This week started with some heavy testing for setup_replication.sh, and some bugs did come up.
The new version should work in more systems than before, and should be more stable. The bottom line: the chances to end up with replication working are greater now :)

Work is now focused on improving the core cluster code, with several bugs being ironed out from the slave routine and the takeover procedure. The rc-script has also been fixed, and even though its 'design' is not something I'm proud of (I guess it's good to know I've learned a lot about writing init scripts in the past 4 years, hehe), it does work for Red Hat/Fedora in all scenarios now.

What will come next, in the short term:
  • use of the ssh agent in order to allow the use of passphrases for key based ssh
  • use of sudo to run anything that requires root (service mysqld restart, ifconfig, fake, etc.)
  • Heavy testing under CentOS on our side, with a RHEL4 system that's being offered by a user. Let's hope good things come out of this too.

Monday, October 30, 2006

update | alpha-6.3-1

I've just released v. 6.3-1 and the aforementioned failover scenarios are now handled properly.

ToDo:

  • Isolate the install/config scripts so folks can use the cluster with manual configuration (the way it's done with heartbeat, for instance). Ideally, the code should be forked so bugs are handled independently.
  • Rewrite the passwordless ssh setup so we use the ssh agent and therefore store the private key with a passphrase (at the end user's choice, of course)
  • Use sudo to run anything that needs root, so the cluster can run with an unprivileged account
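
As a rough illustration of the sudo item, here is a minimal Python sketch (mysql-ha itself is written in shell) of how a remote command could be wrapped so it runs through sudo under an unprivileged account. The function name `remote_command` and the host name in the usage note are made up for the example; this is not code from the project.

```python
import shlex


def remote_command(host, command, use_sudo=True):
    """Build the argv list for running `command` on `host` over ssh.

    `command` is a list of arguments, e.g. ["service", "mysqld", "restart"].
    """
    if use_sudo:
        # Prefix with sudo so the ssh login itself can be unprivileged.
        command = ["sudo"] + list(command)
    # ssh hands a single string to the remote shell, so quote each argument.
    remote = " ".join(shlex.quote(arg) for arg in command)
    return ["ssh", host, remote]
```

Calling `remote_command("master", ["service", "mysqld", "restart"])` returns an argv list that can be passed to `subprocess.run`; the account on the other end would still need a matching sudoers entry.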

Lots of problems, lots of fun!

yay!

After two months of interrupted hard work (yes, interrupted), I got to handle two error situations almost flawlessly.

Almost :)

The first scenario was a simple mysqld stop on the master, which was properly handled by a remote mysqld start by the slave.

The second scenario was a hard failure on the master, which was properly handled by the slave (it took over the service), with the only added problem that it lost its original IP address, becoming reachable only through the cluster IP.
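
The two scenarios boil down to a small decision: what should the slave do, given what it can see of the master? Here is a minimal Python sketch of that decision, just to make the logic explicit. The names `MasterState` and `choose_action` and the action strings are hypothetical; the real cluster code is shell and is organized differently.

```python
from enum import Enum


class MasterState(Enum):
    HEALTHY = "healthy"          # host reachable, mysqld answering
    MYSQLD_DOWN = "mysqld_down"  # host reachable, mysqld stopped
    HOST_DOWN = "host_down"      # host unreachable (hard failure)


def choose_action(state):
    """Return the action the slave should take for a given master state."""
    if state is MasterState.HEALTHY:
        return "none"
    if state is MasterState.MYSQLD_DOWN:
        # Scenario 1: the host is still up, so start mysqld remotely.
        return "remote_mysqld_start"
    # Scenario 2: the host is gone, so the slave takes over the service.
    return "takeover"
```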

Anyway, there's lots of hard work to do towards 1.0, but alpha-7 is much closer now, and we're really closer to saying 'happy clustering' again!

Monday, October 23, 2006

documentation update draft

I'm working on updating the cluster documentation to reflect the latest changes (in our code and in MySQL).

Here are some general notes.

Replication privileges

In order to allow replication from the slave host (slavehost), we only need to run this statement on the master host:

GRANT REPLICATION SLAVE ON *.* TO 'replicationuser'@'slavehost';

mysql-ha creates a user with just this privilege, and it's recommended that you don't grant this user any more privileges than needed. Note that REPLICATION SLAVE is a global privilege, so it must be granted ON *.*; which databases actually get replicated is controlled on the slave (for instance, with replicate-do-db entries in my.cnf).


mysql-ha limitations and known issues:

  • geographic distribution
Right now the cluster is based on sharing an IP address. This technique works if both master and slave node are on the same physical network. We need to modularize this code so that we can share a network resource instead, this resource being either an IP address, a dynamic DNS entry, etc.
  • remote execution security
Right now, remote script execution is based on passwordless remote ssh. In order to allow this, we set up pubkey/privkey based ssh without a passphrase. This is an obvious security issue. We need to use the ssh agent in order to use a passphrase to protect the private key. This should be provided as an option to the end user.
  • ARP spoofing
The cluster uses ARP spoofing only if the failover can't be forced on the master node. ARP spoofing is generally ignored by routers but is a normal technique used by clusters (heartbeat uses it by default to speed up the propagation of an IP address change). We should allow the customization of this with three options:
    • no spoofing
    • spoof only when needed (as is done now)
    • spoof always (as heartbeat does)
Please note that ARP spoofing is only needed if the cluster uses a shared IP address.
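
To make the three proposed options concrete, here is a small Python sketch of how such a setting could drive the decision at takeover time. The policy names ("never", "when_needed", "always") and the function `should_spoof` are assumptions for illustration only; no such option exists in the code yet.

```python
def should_spoof(policy, master_released_ip):
    """Decide whether to ARP-spoof during a takeover.

    policy: "never", "when_needed" or "always" (hypothetical names).
    master_released_ip: True if the failover could be forced on the master,
    i.e. the master gave up the shared IP address by itself.
    """
    if policy == "never":
        return False
    if policy == "always":
        return True
    if policy == "when_needed":
        # Current behaviour: spoof only if the master could not be forced.
        return not master_released_ip
    raise ValueError("unknown ARP spoofing policy: %r" % (policy,))
```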

Another issue is that we execute remote commands using the root account. I'm currently working on updating the code to use sudo, so we no longer need the root password on the master/slave node and remote commands can be executed by non-privileged accounts.

More info on this later..

Thursday, October 19, 2006

alpha-7 roadmap

alpha-6.2 was the last release from the 6 series, and completed a wxPython based cluster
configurator.

New releases on this series will just include fixes to existing features.

I'm working hard on alpha-7, fixing issues with the installation procedure, with the ultimate goal of achieving a smooth install and a working takeover on FC5.

I noticed that the configuration-wrapper.sh script doesn't always fall back correctly, and there are
probably many more bugs introduced in the recent developments.

However, I hope to make a usable release again (I believe the last was alpha-5, with FC2) for alpha-7 RSN. Like.. next week or something :)

Roadmap:
- update the existing docs
- create documentation for a manual installation
- modularize the network resource sharing code, so that instead of using just a shared IP, the
cluster can also be implemented with dynamic DNS entries (for geographically distributed clusters).
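
One way to picture the last roadmap item is as a small interface that the cluster code talks to, with one implementation per kind of shared resource. The Python sketch below is only a thought experiment: the class names, the `takeover_plan` method, and the example values are all hypothetical, and the plans are plain descriptions rather than real commands.

```python
from abc import ABC, abstractmethod


class NetworkResource(ABC):
    """Whatever clients use to reach the node that is currently active."""

    @abstractmethod
    def takeover_plan(self):
        """Describe how the surviving node claims this resource."""


class SharedIP(NetworkResource):
    def __init__(self, address, interface):
        self.address = address
        self.interface = interface

    def takeover_plan(self):
        return "bring up %s on %s" % (self.address, self.interface)


class DynamicDNS(NetworkResource):
    def __init__(self, hostname, node_address):
        self.hostname = hostname
        self.node_address = node_address

    def takeover_plan(self):
        return "point %s at %s" % (self.hostname, self.node_address)


def take_over(resource):
    # The cluster code depends only on the abstract interface,
    # so it does not care which kind of resource is being shared.
    return resource.takeover_plan()
```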

Monday, October 09, 2006

ha update

Working like a dog lately..

mysql-ha has seen alpha-6.1 recently, with a revamped configuration script, supporting new backends (dialog for this release, and X for the next).

I'm learning wxPython as fast as I can :)

I'm also tidying up the code for a new service availability and clustering project I'll make available soon (GPL too, of course).

The new configuration backends are not required for a mysql-ha installation. I've added a configuration-wrapper.sh script, which tests for the availability of a given backend (dialog or python && wxPython) and fires up a script according to the best available backend. If none is found, the original configuration-menu.sh is run.
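
The selection logic is easy to sketch. The real wrapper is a shell script; the Python version below only illustrates the idea, and it assumes that wxPython is preferred over dialog when both are present. The function name `pick_backend` and the labels it returns are made up for the example.

```python
import importlib.util
import shutil


def pick_backend(has_wx=None, has_dialog=None):
    """Return "wx", "dialog" or "menu", preferring the richest backend.

    When an argument is None, the backend is probed on the running system.
    """
    if has_wx is None:
        # wxPython installs a top-level module named "wx".
        has_wx = importlib.util.find_spec("wx") is not None
    if has_dialog is None:
        has_dialog = shutil.which("dialog") is not None
    if has_wx:
        return "wx"
    if has_dialog:
        return "dialog"
    # Nothing fancier found: fall back to the plain menu script.
    return "menu"
```

The two optional arguments exist only so the choice can be exercised without having either backend installed.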

As usual, we need more testing. I know people are reading the list because the project is being downloaded, so please let me know the problems you run into!

Regards,

Monday, September 25, 2006

mysql-ha alpha-6

I've released alpha-6 of mysql-ha just a few minutes ago.
This release includes a working version of setup_replication.sh, tested only on Fedora Core 5.
This script asks a few questions (mostly passwords and paths) and properly sets up MySQL version 5 replication on the mentioned distribution.

I'm in need of testers willing to invest some time and run the script on different distros, so we can be
as distribution agnostic as possible.

Work will now focus on alpha-7, which will include the takeover/failover mechanisms tested on Fedora Core 5.

Check the release at http://www.seriema-systems.com/mysql-ha/index.php?page=downloads
or just go straight to https://svn.sourceforge.net/svnroot/mysql-ha/trunk/mysql-ha/setup_replication.sh
if you're only interested in testing the script.

Wednesday, September 20, 2006

The media center adventure...

I'm suspending work on mysql-ha until Friday, because I need to make some progress on the media center demo machine.

So far, I've tried freevo (great if you want to build a 'black box' machine) and improvements on regular software (like the xmms cdcover plugin).

As usual, my test system is a FC5 box with no tv tuner for now, but hopefully a linux-supported Pinnacle
board will be in next week.

In order to get freevo running on FC5 you need to:

Configure the repositories

freevo.repo:

[freevo]
name=Freevo RPM Repository for Fedora Core
baseurl=http://freevo.sf.net/fedora/$releasever
gpgkey=http://freevo.sourceforge.net/fedora/tcwan_freevo_key.asc
enabled=1
gpgcheck=0

dries and/or (I'm not sure yet!) freshrpms for dependencies:

[dries]
name=RPMForge: Dries
mirrorlist=http://apt.sw.be/dries/fedora/fc$releasever/mirrors-rpmforge
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY.dries.txt
gpgcheck=0
enabled=1

[freshrpms]
name=RPMForge: Freshrpms
baseurl=http://ayo.freshrpms.net/fedora/linux/$releasever/$basearch/freshrpms/
mirrorlist=http://ayo.freshrpms.net/fedora/linux/$releasever/mirrors-freshrpms
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-freshrpms
gpgcheck=0
enabled=1

Get freevo

At this point, yum install freevo will work, but it won't get some of the dependencies. To be fair, it
will get anything you need to actually run the thing, but, for instance, if you don't have mplayer or xine, it won't be installed.

Running freevo


For the first time, you need to run freevo setup. This will create a ~/.freevo/freevo.conf file, which you
shouldn't mess around with since it's automatically generated. Instead, if you want to alter freevo's behaviour, you should write a ~/.freevo/local_conf.py file. In fact, you need this file before you can run freevo, with at least a line containing CONFIG_VERSION = 5.15

Now you're ready to run freevo, which can start as an X application, or can be started from a console session with the -fs option, which will start another X session and run freevo on it.

Things I've tried

The audio.coversearch plugin, which uses Amazon's web services (you'll need a developer account, and they're free) in order to fetch the cover of the CD you're currently playing. The downside is you need to press 'e' while listening to the CD to get a menu that will let you fetch the cd cover. I'm working on changing the default behaviour so that the cover is fetched automatically.

The headlines plugin, which I couldn't get to work with google news (other rss feeds worked OK so maybe I'm doing something wrong).

You'll probably need to manually specify the CD-ROM drive(s) by setting the ROM_DRIVES variable in the config file.

Here's my entry, as an example:
ROM_DRIVES = [ ('/media/cdrom', '/dev/hdc', 'CD') ]

If that doesn't look like a typical config setting, take into account that this is a py file after all, so freevo will be configured by running this file through the python interpreter (hence, the file contents must be valid python code).
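
If the idea of a config file being plain Python is new to you, here is a tiny sketch of how such a file can be loaded. This is not freevo's actual loader, just the general technique; the helper name `load_config` is made up, and the settings are the two mentioned in this post.

```python
def load_config(source):
    """Execute Python source text and return the upper-case names it defines."""
    namespace = {}
    exec(source, namespace)
    # Keep only SETTING_STYLE names; this also drops __builtins__.
    return {name: value for name, value in namespace.items() if name.isupper()}


example = """
CONFIG_VERSION = 5.15
ROM_DRIVES = [ ('/media/cdrom', '/dev/hdc', 'CD') ]
"""

settings = load_config(example)
print(settings["ROM_DRIVES"][0][1])  # prints the device node: /dev/hdc
```

Since the file is executed, a syntax error in it is a Python error, and anything you can compute in Python can end up in a setting.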

So far I'm still messing around with it and I still have a lot of work to do, but I like the architecture very much. Python code is easily hackable (I had nice experiences with anaconda in the past) and everything here is written as a plugin, which you can enable/disable at will.

There's a wiki at the freevo site (freevo.sf.net) and it includes documentation on writing your own plugins.

More news on this tomorrow or Friday.

Regards,

Monday, September 18, 2006

setup_replication.sh and home media centers

I'm really close to getting setup_replication.sh to run smoothly on FC5. In fact, it already installs with no
errors, but replication doesn't work afterwards.

I'm in great need of testers, so if you have a couple of spare boxes (virtual or otherwise) and a few minutes, jump to https://svn.sourceforge.net/svnroot/mysql-ha/trunk/mysql-ha/setup_replication.sh and get the
script to run on your system.

If you're on another distro, please test it anyway (you might encounter many path issues but not much more, I hope..). While my current focus is getting it to work on FC5, mysql-ha has traditionally worked on Debian too.

I'm also working hard to get a gnu/linux based home media center for an exhibition coming next month.
The current setup of choice is FC5 (it was Ubuntu, but I couldn't get freevo to work there) with freevo, and I'll try mythtv too.

I plan to post an entry with detailed installation instructions once I get this thing going.

Tuesday, September 12, 2006

mysql-ha revival

I've started to work on the mysql-ha project once again. This project provides a highly available database server using MySQL.

Right now, I'm working on automating the replication setup process, according to the instructions provided in the 5.0 manual.

Tasks pending for the next release:
  • get a list of the master's databases in order to populate the my.cnf file's replicate-do-db entries
  • backup all files before applying changes made by the script
  • get replication to start automatically after the changes have been applied
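
The first task is mostly text generation once the database list is known. Here is a minimal Python sketch of that step (the real script is shell); the function name and the decision to skip the `mysql` and `information_schema` databases are assumptions for the example, not what setup_replication.sh does.

```python
# Assumption: system databases are left out of the generated entries.
SYSTEM_DATABASES = {"information_schema", "mysql"}


def replicate_do_db_lines(databases):
    """Return one replicate-do-db line per user database, sorted by name."""
    wanted = sorted(set(databases) - SYSTEM_DATABASES)
    return ["replicate-do-db=%s" % name for name in wanted]
```

The input would be whatever the master reports as its list of databases; each returned line goes in the [mysqld] section of the slave's my.cnf.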

Until the release is made, you can check out the code from CVS and, soon (at most in 24 hours), from SVN too.

If anyone is willing to test this, please comment on this post.

Regards,

Sunday, September 10, 2006

Fedora Core on the Presario V2617LA


Recently, I became the happy owner of a Compaq Presario V2000, model V2617LA.
This is a reasonably good notebook, particularly considering its price, and my distribution of choice (Fedora Core) installs smoothly on it.

Every device I use works properly under Linux (I haven't checked the modem, and I don't think I'll be doing that anytime soon), with the drivers included in the distro, except for the wireless card, which needs ndiswrapper and the Windows driver.

What's included

The machine comes with Windows XP Home edition, for which I was forced to purchase a license that doesn't even include a CD. You're entitled to a rescue media set, on your choice of CD or DVD, but if you're interested in keeping Windows, take good care of that media because you can only create the set once.

Fortunately, I wasn't interested in this so I proceeded to boot off the FC5 cdrom and wipe XP out of the hard disk.

Installing Fedora Core

You can just hit enter on the isolinux boot screen, since the setup procedure properly detects and configures the display and video adapter (proprietary ATI drivers are needed later, in case you want to make use of the board's 3D features).

The installation process is smooth, and includes detection and configuration of the ethernet board.

Once it's over, the screen resolution is set to 800x600, which is awful for the widescreen display.
However, as I said, the board and display are detected properly, so all you have to do is edit /etc/X11/xorg.conf and manually add the 1280x768 and 1024x768 (in case you want it) modes to the Screen section.

In order to get the wireless LAN working, you'll need the Windows drivers. These are available from the hp/compaq web site if, just like me, you forget to back them up before wiping out XP (you'll need the
SP31463A exe).

I installed them with wine, which of course failed, but nevertheless uncompressed the files, which was my goal anyway.

Once the files are available, you must install the driver with ndiswrapper, like this:

ndiswrapper -i bcmwl5.inf # installs the driver with ndiswrapper
ndiswrapper -l # verifies the driver installation process
modprobe ndiswrapper # loads the driver into the kernel
ndiswrapper -m # fixes /etc/modprobe.conf accordingly

You must also blacklist the open source bcm43xx driver, since it doesn't work properly on this board. In order to do this, add the line blacklist bcm43xx to /etc/modprobe.d/blacklists

If you can't connect to your AP, be sure to check that you've physically enabled the wifi card (the button to the left of 'power').

With this problem solved, you can safely say that FC5 is installed on the notebook.

Some issues:
  • pm-hibernate doesn't always detect the LCD-closed event.
  • The wifi card is physically disabled after hibernation and it must be manually turned on.
  • After several 'hibernations', the battery charge is reported wrong (this happened just once).
  • The sempron is configured to low speed even with the power on. I use cpuspeed to fix this.

All in all, the V2617LA is a good notebook for its price, and runs gnu/linux smoothly and with just one proprietary driver (or two, if you really need 3D video).