October 23, 2003

Sexed Text

The Gender Genie purports to determine the gender of a writer given a body of text. The application is based on an algorithm developed by Moshe Koppel of Bar-Ilan University in Israel and Shlomo Argamon of the Illinois Institute of Technology, who analyzed over 500 English-language works for gender-based word patterns.

The New York Times discusses it in Sexed Texts. Also see Computer program detects author gender.

The Gender Genie thinks I am a female; I am not. I am not particularly impressed with the Gender Genie after running 5 or 6 texts through it.

Posted by: josh at Oct 24, 2003 5:19:04 PM

I ran through four different things I had written. Apparently I am indisputably a male. Alas, my colleagues would dispute that based on certain physical attributes.

Posted by: Cath at Oct 24, 2003 8:48:34 PM

Like the previous people, the Gender Genie thinks I'm male. I ran 4 samples of my writing through it, then I ran a sample of writing by a famous female author. The Gender Genie thought she was male as well.

Posted by: melanie at Oct 25, 2003 8:08:00 PM

I'd be very curious to know the accuracy amongst different subpopulations. I ran my personal blog through it: female. My academic blog: male. My academic writings: uber male. My random notes to friends: female.

I think I have a complex, or else the system is reminding me that more socially "appropriate" material is definitely male. Just an ounce of emotions and bam! female kicks in.

Posted by: danah boyd at Oct 26, 2003 1:02:38 PM

especially telling is the stat at the bottom of the page where it shows barely better than break-even correct guessing for the time period of aug 15th to sept 13th. if you look at the tables in the analysis page it also tends to break down impersonal words and relative words into male and female respectively.
overall it's a dumb trick which doesn't try for advanced text-analysis features, so what can be expected?
--random noises

Posted by: anonymous at Oct 26, 2003 2:13:32 PM

It's no surprise that academic writings come up male, and personal blogs, female: the algorithm looks at the incidence of personal pronouns, among other things, and obviously in academic-speak that incidence is almost nil. The same thing happened to me, comparing blog posts with thesis chunks. It's not really very sophisticated, and can I just say that the choice of terminology being "male and female" rather than "masculine and feminine" really, really, pisses me off: male and female are biological categories, while masculine and feminine are cultural ones and therefore far more applicable.

Posted by: jean at Oct 31, 2003 4:54:07 AM
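
Several comments above put the blog-versus-academic split down to how often personal pronouns show up. Here is a minimal Python sketch of that single signal, just to make the point concrete. The pronoun list and the `pronoun_rate` function are hypothetical and chosen for illustration; this is not the Gender Genie's actual algorithm, word list, or weighting, none of which are given here.

```python
import re

# Hypothetical word list, for illustration only; NOT the Gender Genie's feature set.
PERSONAL_PRONOUNS = {
    "i", "me", "my", "mine", "myself",
    "you", "your", "yours",
    "he", "him", "his", "she", "her", "hers",
    "we", "us", "our", "ours",
    "they", "them", "their", "theirs",
}


def pronoun_rate(text):
    """Return the fraction of words in `text` that are personal pronouns."""
    # Lowercase the text and pull out runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in PERSONAL_PRONOUNS)
    return hits / len(words)


blog_sample = "I told my friends that we loved it"
academic_sample = "The results indicate a significant effect on accuracy"

print(pronoun_rate(blog_sample))
print(pronoun_rate(academic_sample))
```

A chatty sample scores well above a formal one on this measure, which fits the pattern reported above: the same writer can land on either side depending on genre. Note that contractions such as "I'm" are counted as a single word and do not match the list in this sketch.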