On Vox: Fedora 12, Dracut, dmraid, mdadm, oh my!

It appears that Fedora 12 moved to a new boot init system called dracut.  Sadly, due to a number of odd circumstances, this has caused me much pain.  Here's my basic config:

  • /boot and /  on /dev/sda
  • /var and /home on a partitioned software raid on /dev/sd{cd}

After a yum-based upgrade to Fedora 12 I rebooted.  We get to the point where the software raid initializes and boom: failure.  I'd seen this before; partitioned raid has always had some trouble in Fedora.  Previously I had to modify the rc.sysinit script to reset the raid partitions, so I tried that again, moving that init to later in the boot sequence.  Reboot and yes, it works.

However, I then noticed some odd things.  I was only getting a single drive in my mirrored RAID.  Further investigation revealed a device dm-1 instead of sdc or sdd listed in /proc/mdstat...  Uh oh.

Looking more closely, it appeared that my drives were being set up by dmraid as a fake-raid mirror:

# dmraid -r 
/dev/sdd: sil, "sil_aiabafajfgba", mirror, ok, 488395120 sectors, data@ 0
/dev/sdc: sil, "sil_aiabafajfgba", mirror, ok, 488395120 sectors, data@ 0

I tried adding the nodmraid option to grub.conf, but then the new dracut system started an infinite spew of messages generated by this mdadm error string (lifted from Assemble.c):

fprintf(stderr, Name ": WARNING %s and %s appear"
        " to have very similar superblocks.\n"
        "      If they are really different, "
        "please --zero the superblock on one\n"
        "      If they are the same or overlap,"
        " please remove one from %s.\n",
        devices[best[i]].devname, devname,
        inargv ? "the list" :
        "the\n      DEVICE list in mdadm.conf"
        );

Drat! The mirrored fake raid had already mangled my second drive by duplicating the superblock!  Plus, since all this was going on in dracut, I couldn't fix it.  So I removed the nodmraid option in grub during boot and dug a little deeper.  I found that I could keep dracut from doing all this nonsense by adding the following kernel options:

rd_NO_MD rd_NO_DM nodmraid
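
These go at the end of the kernel line in grub.conf, something like this (the kernel version and root device below are placeholders, not mine):

```
kernel /vmlinuz-2.6.31.5-127.fc12.x86_64 ro root=/dev/sda2 rd_NO_MD rd_NO_DM nodmraid
```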

This allows a minimal boot without dmraid or mdadm.  After that I was dropped into single-user mode with the duplicate superblock message.  Fixing it required zeroing the superblock on the sdd member:

mdadm --zero-superblock /dev/sdd1

And then rebooting (again!)

Once past this, things started working somewhat normally.  To get my raid mirrored again I did the normal thing:

# mdadm --manage /dev/md_d0 --add /dev/sdd1

To get rid of the false-positive fake raid setup, I found you can erase the stale metadata with the dmraid tool itself:

[root@mirth ~]# dmraid -E -r /dev/sdd

Do you really want to erase "sil" ondisk metadata on /dev/sdd ? [y/n] :y

[root@mirth ~]# dmraid -E -r /dev/sdc

Do you really want to erase "sil" ondisk metadata on /dev/sdc ? [y/n] :y

The really odd thing about this whole incident is that I never had these drives in a fake raid setup before. 

In any case, hope this helps the few other people who might have this same problem.

Originally posted on paul.vox.com

On Vox: Email Clients Full Circle

In the beginning I used elm to read my mail.  This was somewhat radical, especially as I worked with the team that created POPMail for the Mac and Minuet for the PC, and everyone else moved to pine.  Then came Mutt -- happy days -- I was able to slice and dice email with amazing speed.

A couple of years ago I converted over to Mail.app -- mostly because of the contacts and calendar integrations, and the fact that I could merge personal and corp email accounts.  In the intervening time I had to move to Comcast, which meant running my own imap server proved more difficult than it was worth, so I moved to Google Apps for Your Domain.  All of a sudden my personal domain was running Gmail, and I discovered it has key bindings.

It's mutt deja vu all over again.  Navigation with vi j/k keys? Yes.  Single-window view (inbox/message)? Yes again.  Tagging messages? Yes.  Blazingly fast? You bet.  The only thing I miss is keystroke filtering of messages.

That's one reason why I see things like Google Wave working out so well.  I might be late to the gmail party, but plenty of folks have been using this as their primary mode of communication for a long, long time.

Originally posted on paul.vox.com

On Vox: Tomcat and SSL Accelerators

Using an SSL accelerator like a Netscaler is really useful: you can offload a lot of work to a device that does SSL in hardware, and you can use SSL session affinity to send requests to the same backend.  In the simplest setup the SSL accelerator accepts the request and proxies it to your internal set of hosts running on port 80.

However, code that generates redirects and URLs works poorly, because servletRequest.getScheme(), isSecure(), and getServerPort() will return http/false/80 for both SSL and non-SSL connections.

One way to solve this is to listen on multiple ports: create a Connector on 80 and another on 443, but run SSL on neither.  Then configure the 443 Connector with secure="true" and scheme="https".  This is suboptimal, however, as you then have to manage yet another server pool in your load balancer and you end up sending twice the health checks.  Not so good.
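
For reference, that two-Connector setup looks something like this in server.xml (ports and protocol per a stock Tomcat config; adjust to taste):

```xml
<!-- Normal traffic the accelerator proxied as plain http -->
<Connector port="80" protocol="HTTP/1.1" />

<!-- Traffic the accelerator received over SSL: still plain http here,
     but Tomcat reports https/secure/443 to the application -->
<Connector port="443" protocol="HTTP/1.1"
           secure="true" scheme="https" proxyPort="443" />
```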

You might try to solve this with a servlet Filter: wrap the request in an HttpServletRequestWrapper that overrides the scheme, port, and secure flag.  Overriding these will let application logic see the updated values, but sadly it doesn't fully work.  Because of the way Tomcat implements HttpServletResponse, it uses the original request object to ascertain the scheme/secure flag/port, so you get into trouble when you call encodeRedirectURL() or sendRedirect() with non-absolute URLs.

Lucky for us, Tomcat supports a way to inject code into the connection-handling phase via Valves.  A valve can query and alter the Catalina and Coyote request objects before the first filter is run.

To make your Valve work you'll need to configure your load balancer to send a special header when SSL is in use.  On the Netscaler this can be done by setting owa_support on.  With that enabled the http header Front-End-Https: On is sent for requests that use SSL.

Once we have these pieces in place the Valve is fairly straightforward:

import java.io.IOException;

import javax.servlet.ServletException;

import org.apache.catalina.connector.Request;
import org.apache.catalina.connector.Response;
import org.apache.catalina.valves.ValveBase;

public class NetscalerSSLValve extends ValveBase {

        public void invoke(Request req, Response resp) throws IOException, ServletException {
                // The Netscaler sets this header for requests that arrived over SSL
                if ("On".equals(req.getHeader("Front-End-Https"))) {
                        // Fix up the request so getScheme()/isSecure()/getServerPort()
                        // (and redirect URL generation) reflect the original https request
                        req.setSecure(true);
                        req.setServerPort(443);
                        req.getCoyoteRequest().scheme().setString("https");
                }
                if (getNext() != null) {
                        getNext().invoke(req, resp);
                }
        }
}

Compile this, stick it in the Tomcat lib directory, add an entry to your server.xml, and away you go.
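
For completeness, the server.xml entry is just a Valve element inside your Host (assuming the class above lives in the default package; add your package prefix if you use one):

```xml
<!-- in conf/server.xml, inside the <Host> element -->
<Valve className="NetscalerSSLValve" />
```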

Originally posted on paul.vox.com

On Vox: The Mysteries of Java Character Set Performance

"Two Character Sets?  Seems like plenty!"

So I've been pushing Java to its limits lately and finding some real nasty concurrency issues inside the JRE code itself.  Here's one particularly ugly one -- we had 700 threads stuck here:

       java.lang.Thread.State: BLOCKED (on object monitor)                                                                    
         at sun.nio.cs.FastCharsetProvider.charsetForName(FastCharsetProvider.java:118)
         - waiting to lock <0x00002aab4cdf91b8> (a sun.nio.cs.StandardCharsets)
         at java.nio.charset.Charset.lookup2(Charset.java:450) 
         at java.nio.charset.Charset.lookup(Charset.java:438)
         at java.nio.charset.Charset.isSupported(Charset.java:480) 
         at java.lang.StringCoding.lookupCharset(StringCoding.java:85) 
         at java.lang.StringCoding.decode(StringCoding.java:165)                                                                      
         at java.lang.String.<init>(String.java:516) 

Digging deeper, we find that lookupCharset is called all over the place.  The app in question functions as a web proxy, so it's constantly reading and writing data from web pages in a variety of character sets.  The method charsetForName() uses a synchronized data structure to look up defined character sets.  (Yay, serialized access....)

But wait, lookup and lookup2 provide us with a cache so we can avoid the big bad synchronized method.  Sigh.  Here's the implementation:

     private static Charset lookup(String charsetName) {
         if (charsetName == null)
             throw new IllegalArgumentException("Null charset name");

         Object[] a;
         if ((a = cache1) != null && charsetName.equals(a[0]))
             return (Charset)a[1];
         // We expect most programs to use one Charset repeatedly.
         // We convey a hint to this effect to the VM by putting the
         // level 1 cache miss code in a separate method.
         return lookup2(charsetName);
     }

     private static Charset lookup2(String charsetName) {
         Object[] a;
         if ((a = cache2) != null && charsetName.equals(a[0])) {
             cache2 = cache1;
             cache1 = a;
             return (Charset)a[1];
         }

         Charset cs;
         if ((cs = standardProvider.charsetForName(charsetName)) != null ||
             (cs = lookupExtendedCharset(charsetName))           != null ||
             (cs = lookupViaProviders(charsetName))              != null) {
             cache(charsetName, cs);
             return cs;
         }

         /* Only need to check the name if we didn't find a charset for it */
         checkName(charsetName);
         return null;
     }
Yes, a whopping 2-entry cache!!

Also, the keys used are not canonicalized, so if my app asks for "UTF-8", "utf-8", and "ISO-8859-1" with any regularity this 2-entry cache is worthless; every call ends up blocking in the evil thread-synchronized data structure.

Someone send them a copy of the ConcurrentHashMap doc.  please.
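
(If you're stuck on an affected JRE, a workaround on the application side is simple enough: memoize Charset.forName() behind a ConcurrentHashMap keyed on whatever names the app actually asks with.  This is a sketch, not the JDK's code; the class name is made up.)

```java
import java.nio.charset.Charset;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical lock-free cache in front of Charset.forName().
// Unlike the JDK's 2-entry cache, this scales to any number of
// distinct (and non-canonical) names without a global lock.
public final class CharsetCache {
    private static final ConcurrentHashMap<String, Charset> CACHE =
            new ConcurrentHashMap<String, Charset>();

    public static Charset forName(String name) {
        Charset cs = CACHE.get(name);
        if (cs == null) {
            // Several threads may race here; Charset.forName is
            // idempotent, so last-write-wins is harmless.
            cs = Charset.forName(name);
            CACHE.put(name, cs);
        }
        return cs;
    }
}
```

After the first lookup for a given name, every call is a lock-free ConcurrentHashMap read, no matter how many charsets or alias spellings are in play.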

</rant> ....

Originally posted on paul.vox.com

On Vox: OpenSocial Roundup

At hi5 we've been busy busy busy getting OpenSocial up and running.  We released our developer sandbox and are rapidly implementing features.  So check out the following URLs

Also, here's a copy of my response to Tim O'Reilly's blog post:

OpenSocial: It's the data, stupid

Hi folks,

Good comments all around. However I'd like to posit that data access is _not_ the problem. We've had universal standards for years now with little uptake. Tribe.net, Typepad, LiveJournal and others have supported FOAF for many, many years, which encompasses the OpenSocial Person and Friends APIs. Not much has come of that -- there isn't a large enough base there to get people interested.

Now you have a broad industry consensus on a single way to provide all of the above plus activity stream data. You have a rich client platform that allows you to crack open that data and use it in interesting ways, and finally you have a common standard for social networks to interact with each other based on the REST api.

So Patrick's statement at the Web 2.0 Expo is correct: an app running inside a container only sees what that container shows it. However, that does not mean that a container could not include friend references to external social networks via its own federation mechanism. Movable Type 4.0 has shown that you can support any OpenID login in a single system; there's no reason to believe that social networks could not leverage OAuth to do the same.

And here's a final point to consider -- you have Myspace opening up to developers. That's huge. That alone is going to draw more developer attention to this problem than much of the oh-so academic discussions of the past few years.

I suggest that people who _want_ OpenSocial to solve all the social-graph ills get involved on the API mailing list and make sure those elements are addressed as OpenSocial evolves.

There's a tremendous amount of momentum. Let's not waste this chance.

Originally posted on paul.vox.com

On Vox: Suggestions

This has got to be a bug....

Dear Amazon.com Customer,

We've noticed that customers who have purchased or rated White Noise Critical: Text and Criticism (Viking Critical Library) by Don DeLillo have also purchased Caught in the Machinery: Workplace Accidents and Injured Workers in Nineteenth-Century Britain by Jamie Bronstein. For this reason, you might like to know that Caught in the Machinery: Workplace Accidents and Injured Workers in Nineteenth-Century Britain will be released on October 10, 2007.  You can pre-order yours by following the link below.

Caught in the Machinery: Workplace Accidents and Injured Workers in Nineteenth-Century Britain
Jamie Bronstein
Price:    $55.00
Release Date: October 10, 2007

Originally posted on paul.vox.com

On Vox: Windows Live API, better than page scraping

So we use this toolkit from Octazen to scrape contact lists off of various sites.  Our ever-eager users (ab)used this feature so much that Hotmail blocked us.

So I waded through reams of API docs over at http://dev.live.com and finally came up with this prototype Perl script to talk to their API servers.  It gets the job done for now; I'll want to rewrite it in native Java and add decent error handling soon.  Hopefully this post will help other folks needing to talk to the Live API:

#!/usr/bin/perl
# Output tab separated values for a given hotmail username/password
# Implementation of Windows Live Contacts API
#    http://msdn2.microsoft.com/en-us/library/bb463974.aspx
# Uses RPS authentication described here:
#    http://msdn2.microsoft.com/en-us/library/bb447721.aspx

use strict;
use warnings;

use HTTP::Request;
use LWP::UserAgent;
use XML::Simple;

my $username = shift || die "Need a username\n";
my $password = shift || die "Need a password\n";
my $apikey   = 'YOURAPIKEY';

my $ua = LWP::UserAgent->new;

# Build the WS-Trust RequestSecurityToken envelope.  NOTE: the envelope
# in the original post was mangled when it was archived; the header/body
# structure below is reconstructed from the RPS docs, and the placement
# of the API key inside ClientInfo is a guess.  No XML escaping of the
# credentials -- fine for a prototype, not for production.
my $uri = 'https://dev.login.live.com/wstlogin.srf';
my $req = HTTP::Request->new(POST => $uri);
my $xml = <<"EOF";
<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope
    xmlns:s = "http://www.w3.org/2003/05/soap-envelope"
    xmlns:wsse = "http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"
    xmlns:saml = "urn:oasis:names:tc:SAML:1.0:assertion"
    xmlns:wsp = "http://schemas.xmlsoap.org/ws/2004/09/policy"
    xmlns:wsu = "http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd"
    xmlns:wsa = "http://www.w3.org/2005/08/addressing"
    xmlns:wssc = "http://schemas.xmlsoap.org/ws/2005/02/sc"
    xmlns:wst = "http://schemas.xmlsoap.org/ws/2005/02/trust">
  <s:Header>
    <wlid:ClientInfo xmlns:wlid = "http://schemas.microsoft.com/wlid">$apikey</wlid:ClientInfo>
    <wsa:Action s:mustUnderstand = "1">http://schemas.xmlsoap.org/ws/2005/02/trust/RST/Issue</wsa:Action>
    <wsa:To s:mustUnderstand = "1">https://dev.login.live.com/wstlogin.srf</wsa:To>
    <wsse:Security>
      <wsse:UsernameToken wsu:Id = "user">
        <wsse:Username>$username</wsse:Username>
        <wsse:Password>$password</wsse:Password>
      </wsse:UsernameToken>
    </wsse:Security>
  </s:Header>
  <s:Body>
    <wst:RequestSecurityToken Id = "RST0">
      <wsp:PolicyReference URI = "MBI"></wsp:PolicyReference>
    </wst:RequestSecurityToken>
  </s:Body>
</s:Envelope>
EOF

$req->content($xml);
$req->content_length(length $xml);

my $res = $ua->request($req);

# Ugly way of hacking out the BinarySecurityToken
my $resultxml = $res->content();
$resultxml =~ m,<wsse:BinarySecurityToken[^>]*>(.*)</wsse:BinarySecurityToken>,si
    or die "No BinarySecurityToken in response\n";
my $binarytoken = $1;

# Request contacts
my $contactsurl = "https://cumulus.services.live.com/$username/LiveContacts/Contacts";
my $authheader  = 'WLID1.0 t="' . $binarytoken . '"';
my $contactsreq = HTTP::Request->new(GET => $contactsurl,
                                     ['Authorization' => $authheader]);

my $contactres = $ua->request($contactsreq);
my $contactxml = $contactres->content();

my $result = XMLin($contactxml, 'ForceArray' => ['Email', 'Contact']);

# Print name and email address, tab separated
foreach my $c (@{$result->{'Contact'}}) {
    my $fname = $c->{Profiles}->{Personal}->{FirstName};
    my $lname = $c->{Profiles}->{Personal}->{LastName};
    foreach my $a (@{$c->{Emails}->{Email}}) {
        print "$fname\t$lname\t" . $a->{Address} . "\n";
    }
}

Originally posted on paul.vox.com