With just a few lines of code you can load and parse microformats from URLs or HTML strings. You can then extract the data directly in .NET or convert it into JSON, JSON-P or XML.
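    using UfXtract;

    // Page to fetch and parse for hCard microformats.
    string url = "http://www.glennjones.net/about/";

    // Load the page and parse it against the hCard format definition.
    UfWebRequest webRequest = new UfWebRequest();
    webRequest.Load(url, UfFormats.HCard());

    // Only write a response if at least one hCard was found.
    if (webRequest.Data.Nodes.Count > 0)
    {
        // Convert the parsed data to JSON and write it out.
        // Response is assumed here to be the ASP.NET response of the current page.
        UfDataToJson dataConvertor = new UfDataToJson();
        Response.ContentType = "application/json";
        Response.Write(dataConvertor.Convert(webRequest.Data, UfFormats.HCard()));
    }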
UfXtract supports a number of microformats: hCard, hCalendar, hReview, hResume, hAtom, XFN, rel-tag, geo, adr, rel-nofollow, rel-license, rel-directory, rel-home, rel-enclosure, rel-payment and votelinks. It also supports a handful of POSH patterns: hCard-XFN, rel-me, rel-next/previous, test-suite and test-fixture. Direct support for rel-me and rel-next/previous was added to help people build social graph spiders.
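To make the rel-me pattern concrete, here is a minimal sketch of what a social graph spider looks for: links whose rel attribute contains the token "me", which point to other pages about the same person. This is an illustration only and is not how UfXtract is implemented. It uses nothing but the standard .NET regular expression classes; the names RelMeSketch, FindRelMeLinks and GetAttribute and the example URLs are hypothetical. Matching HTML with regular expressions is fragile, so treat this as a demonstration of the pattern rather than a parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of UfXtract: finds rel="me" links in an HTML string.
public static class RelMeSketch
{
    // Matches each opening <a ...> or <link ...> tag.
    private static readonly Regex TagPattern =
        new Regex(@"<(?:a|link)\b[^>]*>", RegexOptions.IgnoreCase);

    // Returns the quoted value of the named attribute, or null if it is absent.
    private static string GetAttribute(string tag, string name)
    {
        Match m = Regex.Match(
            tag,
            @"\b" + name + @"\s*=\s*[""']([^""']*)[""']",
            RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    // Collects the href of every tag whose rel value contains the token "me".
    public static List<string> FindRelMeLinks(string html)
    {
        var links = new List<string>();
        foreach (Match tag in TagPattern.Matches(html))
        {
            string rel = GetAttribute(tag.Value, "rel");
            string href = GetAttribute(tag.Value, "href");
            if (rel == null || href == null)
            {
                continue;
            }

            // rel holds a whitespace-separated list of tokens, e.g. "me nofollow".
            string[] tokens = rel.Split(
                new[] { ' ', '\t', '\r', '\n' },
                StringSplitOptions.RemoveEmptyEntries);
            foreach (string token in tokens)
            {
                if (string.Equals(token, "me", StringComparison.OrdinalIgnoreCase))
                {
                    links.Add(href);
                    break;
                }
            }
        }
        return links;
    }

    public static void Main()
    {
        string html =
            "<a rel=\"me\" href=\"http://example.com/profile\">My profile</a>" +
            "<a rel=\"friend met\" href=\"http://example.org/\">A friend</a>";

        // Only the first link carries the "me" token, so only it is printed.
        foreach (string link in FindRelMeLinks(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

A spider would fetch each returned URL in turn and repeat the search, following the chain of rel-me links between a person's profiles.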
To test that UfXtract correctly parses the major microformats, I have created a microformats test suite with input from a number of other parser authors. The HTML documents in the suite are marked up with the test-suite and test-fixture patterns, which allows UfXtract to auto-generate NUnit test classes. UfXtract includes a console application called UfXtractUnitTestBuilder, which updates the tests directly from the web.
Please feel free to fork the code and send back any contributions you would like me to add.
© 2007-2010 Glenn Jones. All Rights Reserved.