The mighty Polysemizer, making sure all the words are widely known.

1. polysemy -- (the ambiguity of an individual word or phrase that can be used (in different contexts) to express two or more different meanings) (from WordNet).

A polysemy count gives a rough indication of how common a word is. Specifically, it is the number of distinct senses a word has; "love" has a polysemy count of 6, because you can love cooking and children, make love, score 9 love in a game of squash, and so forth. Polysemy values are analogous to "familiarity" values, and, for most words, the two are practically the same. Familiarity, however, is uncovered by analyzing large corpora of text and counting how often words occur. Polysemy comes out of what I suppose you could call "intralinguistic" analysis.

So, what I want is a Polysemizer, something to take a given amount of text and tag the words by their polysemy values, and show the result in a coherent way. For instance, the more common words could be shown in black, the less common in shades of gray. This way, when I'm writing advertising or some other text that needs to be easy to understand for the widest range of people of many ages, I can make sure that my vocabulary stays simple, concrete, and easy to fathom, avoiding words which might be perfectly normal to me but odd to others. It would also be helpful with other writing tasks.

The overall tool would be fairly simple to build, since polysemy isn't dependent on knowing the correct part of speech (in most cases; I guess homonyms are a problem): just tokenize a text and look up the polysemy count via the Lingua::Wordnet Perl module, transform that into a color value per word, and display as HTML. I think it could be built in a day or two.
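The pipeline described above (tokenize, look up a count, turn it into a color, emit HTML) could be sketched roughly like this. It is only a sketch in Python, not the Perl tool itself: the SAMPLE_COUNTS table is a made-up stand-in for a real WordNet lookup (only the value for "love" comes from the text), and the names gray_for and polysemize, the cap of 10, and the gray range are all assumptions chosen for illustration.

```python
import html
import re

# Made-up polysemy counts standing in for a real WordNet lookup.
# Only "love": 6 comes from the text above; the other values are invented.
SAMPLE_COUNTS = {"love": 6, "make": 9, "game": 5, "squash": 2}

MAX_COUNT = 10   # assumed cap: counts at or above this render as black
LIGHTEST = 200   # assumed lightest gray level (0 = black, 255 = white)

# A token is either a run of letters/apostrophes (a word) or a run of anything else.
TOKEN_RE = re.compile(r"[A-Za-z']+|[^A-Za-z']+")
WORD_RE = re.compile(r"[A-Za-z']+")


def gray_for(count, max_count=MAX_COUNT):
    """Map a polysemy count to a hex gray: higher counts are darker."""
    capped = max(0, min(count, max_count))
    level = LIGHTEST - (LIGHTEST * capped) // max_count
    return "#{0:02x}{0:02x}{0:02x}".format(level)


def polysemize(text, counts):
    """Return HTML with each word wrapped in a span colored by its count."""
    parts = []
    for token in TOKEN_RE.findall(text):
        if WORD_RE.fullmatch(token):
            count = counts.get(token.lower(), 0)  # unknown words count as 0
            parts.append('<span style="color:{}">{}</span>'.format(
                gray_for(count), html.escape(token)))
        else:
            # Spaces and punctuation pass through, escaped for HTML.
            parts.append(html.escape(token))
    return "".join(parts)


if __name__ == "__main__":
    print(polysemize("I love squash.", SAMPLE_COUNTS))
```

Words missing from the table fall back to a count of 0 and come out in the lightest gray, which is the signal I'd want: a word the lookup doesn't know is probably a word to reconsider. Swapping the dictionary for a real WordNet query should only mean replacing the counts.get call.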




Ftrain.com is the website of Paul Ford and his pseudonyms. It is showing its age. I'm rewriting the code but it's taking some time.



About the author: I've been running this website from 1997. For a living I write stories and essays, program computers, edit things, and help people launch online publications. (LinkedIn). I wrote a novel. I was an editor at Harper's Magazine for five years; then I was a Contributing Editor; now I am a free agent. I was also on NPR's All Things Considered for a while. I still write for The Morning News, and some other places.

If you have any questions for me, I am very accessible by email. You can email me at ford@ftrain.com and ask me things and I will try to answer. Especially if you want to clarify something or write something critical. I am glad to clarify things so that you can disagree more effectively.




© 1974-2011 Paul Ford

