News from the Edge

The care and feeding of a taxonomy

Posted by Wes Fleming on Apr 28, 2016 11:12:01 AM

My kid plants tomatoes every spring.  She waters them, gives them some kind of fertilizer, makes sure they get plenty of sunshine and in turn, the plants produce dozens of little bite-sized tomatoes for her enjoyment.  I’m always amazed at how little effort it takes to get all those tiny tomatoes.

Like tomatoes, taxonomies require attention. If a taxonomy is left to the weeds, it will lose its ability to return thorough, relevant search results over time as new terms enter the lexicon.

One example is the acronym HIX (Health Insurance Exchange), a term that had little importance or prominence until the advent of the Affordable Care Act.  Once editors discovered how this new acronym was being used, it was a simple matter to add it to the collection of terms used to run various health care searches and thus improve those results.

Improving the taxonomy over time involves more than just identifying new terms. An important part of the work centers on disambiguation, especially when people are involved. Telling Ferguson, Missouri apart from Bob Ferguson, the Attorney General of Washington, is critically important to keeping the relevance of search results as high as possible.  Similarly, a quality taxonomy can sort out the fact that there are two people named Will Smith in the news: one a high-profile actor, the other a recently murdered former New Orleans Saints football player.
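To make the disambiguation idea concrete, here is a minimal sketch in Python. It is not how our production system works; the entity ids, the cue lists and the `disambiguate` function are all hypothetical, and the matching is a deliberately naive substring check. It only illustrates the general approach of picking the candidate whose surrounding context fits best.

```python
# Hypothetical entity records: each candidate for an ambiguous mention
# carries context cues that tend to appear near it in an article.
CANDIDATES = {
    "Ferguson": [
        {"id": "place/ferguson-missouri",
         "cues": {"missouri", "st. louis", "police"}},
        {"id": "person/bob-ferguson",
         "cues": {"attorney general", "washington", "lawsuit", "bob"}},
    ],
}


def disambiguate(mention, text):
    """Return the id of the candidate whose cues match the text most often.

    Returns None when the mention is unknown or no cue appears at all.
    Matching is a plain lowercase substring test (an assumption made to
    keep the sketch short; a real system would tokenize the text).
    """
    lowered = text.lower()
    best_id, best_score = None, 0
    for candidate in CANDIDATES.get(mention, []):
        score = sum(1 for cue in candidate["cues"] if cue in lowered)
        if score > best_score:
            best_id, best_score = candidate["id"], score
    return best_id
```

Calling `disambiguate("Ferguson", "Washington Attorney General Bob Ferguson filed a lawsuit")` selects the person record, while an article that mentions Missouri and St. Louis selects the place. Keeping cue lists like these current is exactly the kind of upkeep a taxonomy needs.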

One of the things we pay close attention to is the care and feeding of our taxonomy. We improve, refine and update category and entity lists to avoid as many ambiguity problems as possible.  We categorize about 400,000 new articles every day into roughly 4,500 taxonomic categories with a system that filters through and identifies millions of people, places, companies, brands and more.  Our goal is to revise 4% of our major (most used) filters every month, which gives our most important resources a half-life of roughly one year.
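The arithmetic behind that 4% figure can be checked with a short Python sketch. The function names are ours and purely illustrative, and the two selection models are assumptions about how a monthly batch might be chosen. The one-year figure fits the first model, in which each month's batch covers filters that have not yet been revised.

```python
import math

MONTHLY_RATE = 0.04  # share of major filters revised each month (from the post)


def share_revised_distinct(months, rate=MONTHLY_RATE):
    """Share revised if every monthly batch covers not-yet-revised filters
    (assumption: no filter is revised twice before all have been covered)."""
    return min(1.0, rate * months)


def share_revised_random(months, rate=MONTHLY_RATE):
    """Share revised at least once if every monthly batch is drawn at random
    from all filters (assumption: repeats are possible)."""
    return 1.0 - (1.0 - rate) ** months


def half_life_months_random(rate=MONTHLY_RATE):
    """Months until half the filters have been revised under random selection."""
    return math.log(0.5) / math.log(1.0 - rate)


if __name__ == "__main__":
    print(f"distinct batches, 12 months: {share_revised_distinct(12):.0%}")
    print(f"random batches, 12 months:   {share_revised_random(12):.0%}")
    print(f"random-batch half-life:      {half_life_months_random():.1f} months")
```

With distinct batches, 12 months at 4% covers 48% of the major filters, which is close to half. If batches were drawn at random instead, the halfway point would arrive nearer to 17 months.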

With an effective taxonomy that is constantly reviewed and regularly updated, the search for content can run as efficiently as possible and return the best results possible.

photo credit: Tomato via photopin (license)

Topics: Content, taxonomy

Why read News from the Edge?

The NewsEdge Blog

NewsEdge, a service of Acquire Media, has been serving the information needs of busy professionals in corporate, finance and government for over 25 years.  We are experts in surfacing business-relevant information through web, mobile and feed deliveries.  We specialize in content categorization and distribution to ensure users receive only the news they need, when they need it and how they need it.

Our blog aims to:

  • Discuss hot topics affecting the information industry
  • Offer our insights into new technologies - the good and the bad
  • Invite others to share viewpoints on how information is changing in real-world environments
